TITLE OF THE INVENTION

DELTA 6 FATTY ACID DESATURASE

CROSS-REFERENCE TO RELATED APPLICATIONS 
Not applicable.

STATEMENT REGARDING FEDERALLY-SPONSORED R&D 
Not applicable. 

REFERENCE TO MICROFICHE APPENDIX
Not applicable. 

FIELD OF THE INVENTION 

The present invention is directed to novel human DNA sequences
encoding a delta 6 fatty acid desaturase, an enzyme involved in the synthesis of
essential fatty acids.

BACKGROUND OF THE INVENTION 

Essential fatty acids (EFAs) are polyunsaturated fatty acids that cannot
be manufactured by mammals, yet are required for a number of important biochemical
processes, and thus must be supplied in the diet. The most important dietary EFAs
are linoleic acid and alpha-linolenic acid (ALA). These two EFAs undergo a number
of biosynthetic reactions that convert them into various other EFAs. Figure 1 depicts
the biosynthetic reactions involving the two groups of EFAs, the n-6 EFAs (linoleic
acid derivatives) and the n-3 EFAs (ALA derivatives). EFAs are formed from linoleic
acid and ALA by a series of alternating reactions involving the removal of two
hydrogens coupled with the insertion of an additional double bond (desaturation) and
the lengthening of the fatty acid chain by the addition of two carbons (chain
elongation). The enzymes catalyzing the desaturations and elongations are thought to
be the same for both groups of EFAs.

Among the more important unsaturated fatty acids are the delta 6 
unsaturated fatty acids, which are involved in the maintenance of membrane structure 
and function, the regulation of cholesterol synthesis and transport, and the prevention
of water loss from the skin. Delta 6 unsaturated fatty acids also serve as precursors of
the eicosanoids, including the prostaglandins and leukotrienes (Horrobin, 1992, Prog.
Lipid Res. 31:163-194). The double bond at the 6 position of delta 6 unsaturated fatty
acids is introduced by a class of enzymes known as delta 6 desaturases.

Deficiencies in linoleic acid and ALA derivatives have been associated
with skin diseases, diabetic complications, inflammatory and autoimmune disorders,
cardiovascular disorders, complications of viral infection, and retinal dysfunction.
For example, a deficiency in gamma-linolenic acid (GLA), which is produced from
linoleic acid by the action of the enzyme delta 6 desaturase, can arise from the
decreased activity of this enzyme that occurs in aging, stress, diabetes, eczema, and
some infections, or from increased catabolism of GLA due to oxidation or rapid cell
division, as occurs in inflammation or cancer. Clinical trials have demonstrated that
dietary GLA supplementation can be effective in treating a number of conditions that
are associated with GLA deficiency, e.g., atopic eczema, mastalgia, diabetic
neuropathy, viral infections, and some forms of cancer (Horrobin, 1990, Rev.
Contemp. Pharmacother. 1:1-45).

Delta 6 desaturase is an example of a fatty acid desaturase. Fatty acid 
desaturases are enzymes that introduce a double bond into the carbon chain of fatty 
acids. They play vital roles in the biosynthesis of polyunsaturated fatty acids, 
including the essential fatty acids. Fatty acid desaturases are present in soluble and
membrane-associated forms and require electron donors (for example, cytochrome
b5) for their functioning.

Delta 6 desaturases catalyze the rate-limiting steps in the biosyntheses 
of the linoleic and ALA group EFAs shown in Figure 1. End products of the linoleic 
acid pathway include the eicosanoids (prostaglandins and leukotrienes). The end
product of the ALA pathway is docosahexaenoic acid (DHA), an important
component of membranes in the vertebrate retina. DHA is highly specific for retina
and represents more than 50% of the fatty acids in the rod outer segment (ROS). It
appears that DHA is important in maintaining the normal structure and function of the
retina (Anderson et al., 1992, Neurobiology of Essential Fatty Acids, Bazan et al.,
eds., Plenum Press, New York, pages 285-294). Increased dietary consumption of
DHA and its precursor, eicosapentaenoic acid, from seal meat and fish has been
linked to an increased incidence of macular degeneration in Greenland Eskimos
(Rosenberg, 1987, Arct. Med. Res. 46:64-70).

Certain delta 6 desaturases have been cloned from plants. For 
example, a delta 6 desaturase has been cloned from borage (Sayanova et al., 1997,
Proc. Natl. Acad. Sci. USA 94:4211-4216). This delta 6 desaturase is unusual in that
its cytochrome b5 electron donor is present as an N-terminal extension of the enzyme
rather than being synthesized as a separate protein. The borage delta 6 desaturase has
been shown to be functional, in that transfer of the cloned gene encoding it to tobacco
results in the synthesis of high levels of GLA and octadecatetraenoic acid (OTA) in
the transgenic tobacco leaves. GLA and OTA are the products of delta 6 desaturase
activity on linoleic acid and ALA, respectively.

Based on its hydropathy profile, the borage delta 6 desaturase appears
to be a membrane-bound protein. Examination of the amino acid sequence of the
borage enzyme, as well as the amino acid sequences of membrane-bound desaturases
from a wide variety of organisms, has revealed three regions of conserved short
motifs containing histidine residues (HX(3 or 4)H, HX(2 or 3)HH, and HX(2 or
3)HH) having a conserved spacing from each other (Shanklin et al., 1994,
Biochemistry 33:12787-12794).
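
The histidine-box motifs described above can be written as regular expressions. The following is a minimal sketch in Python, assuming the protein is supplied as a one-letter amino acid string; the function name find_his_boxes and the toy input are hypothetical, and the sketch reports every position that fits a pattern without checking the conserved spacing between boxes.

```python
import re

# Histidine-box patterns as given in the text, with X standing for any residue:
#   HX(3 or 4)H   and   HX(2 or 3)HH
# Lookahead groups are used so that overlapping candidates are all reported.
HIS_BOX_PATTERNS = {
    "HX(3-4)H": re.compile(r"(?=(H.{3,4}H))"),
    "HX(2-3)HH": re.compile(r"(?=(H.{2,3}HH))"),
}


def find_his_boxes(protein):
    """Return (pattern name, 1-based start, matched text) tuples sorted by start."""
    hits = []
    for name, pattern in HIS_BOX_PATTERNS.items():
        for match in pattern.finditer(protein.upper()):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Toy sequence for illustration only; it is not taken from any SEQ ID.
    for hit in find_his_boxes("AAHKLMHAAAAHKLHHAAA"):
        print(hit)
```

A candidate set of three boxes would still need to be checked against the spacing reported for known membrane-bound desaturases before being treated as meaningful.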

A DNA sequence has been isolated from sunflower embryos that,
judging from its sequence, appears to encode a delta 6 desaturase having a
cytochrome b5-like moiety fused to its N-terminus (Sperling et al., 1995, Eur. J.
Biochem. 232:798-805).

SUMMARY OF THE INVENTION 

The present invention is directed to novel human DNA sequences that
encode a delta 6 fatty acid desaturase, cytochrome b5-related protein (CYB5RP). The
present invention includes genomic CYB5RP DNA as well as cDNA that encodes the
CYB5RP protein. The genomic CYB5RP DNA is substantially free from other
nucleic acids and has the nucleotide sequence shown in SEQ.ID.NO.:1. The cDNA
encoding CYB5RP protein is substantially free from other nucleic acids and has the
nucleotide sequence shown in SEQ.ID.NO.:2. Also provided is CYB5RP protein
encoded by the novel DNA sequences. The CYB5RP protein is substantially free
from other proteins and has the amino acid sequence shown in SEQ.ID.NO.:3.
Methods of expressing CYB5RP protein in recombinant systems are provided. Also
provided are methods of producing delta 6 unsaturated fatty acids using DNA
encoding CYB5RP or using CYB5RP protein.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts the enzymatic conversions involved in the linoleic
acid (n-6) and alpha-linolenic acid (n-3) pathways of essential fatty acid synthesis.

Figure 2A-G shows the genomic DNA sequence of the CYB5RP gene
(SEQ.ID.NO.:1). Underlined nucleotides in capitals represent exons. The start ATG
codon at position 544 in exon 1 and the stop TGA codon at position 18,103 in exon
12 are shown in bold. The putative polyadenylation signal ATTAAA located
approximately 20 base pairs upstream of the polyA tail is shown in bold italics
(position 18,373 in exon 12). DNA sequence upstream of exon 1 represents a
putative promoter region of the CYB5RP gene, as indicated by the presence of the
TATA box at position 353 (underlined bold).

Figure 3A-C shows the cDNA sequence (SEQ.ID.NO.:2) and the
amino acid sequence (SEQ.ID.NO.:3) of CYB5RP. The region encompassing amino
acids 1-102 represents the cytochrome b5 domain. The region encompassing amino
acids 182-186 represents HIS BOX 1. The region encompassing amino acids 219-223
represents HIS BOX 2. The region encompassing amino acids 383-387 represents
HIS BOX 3.

Figure 4 shows a portion of the cDNA sequence (SEQ.ID.NO.:4) and a
portion of the amino acid sequence (SEQ.ID.NO.:5) of mouse CYB5RP. 

Figure 5A shows a Kyte-Doolittle hydropathy plot of CYB5RP.
Figure 5B shows the proposed membrane topology of CYB5RP based on its
hydropathy plot. This membrane topology is similar to that proposed for other
membrane-bound fatty acid desaturases (Shanklin et al., 1994, Biochemistry
33:12787-12794). The amino acids shown in Figure 5B are portions of
SEQ.ID.NO.:3.

Figure 6 shows the output of the Profilescan program from the
Wisconsin GCG package. The upper amino acid sequence is from CYB5RP
(positions 31-78 of SEQ. ID. NO.:3). The lower amino acid sequence is positions
1-48 of the cytochrome b5 profile (SEQ. ID. NO.:6). The output shows that CYB5RP
contains a profile typical for the heme-binding domain of the cytochrome b5 protein
family. Importantly, the region of identity includes the invariant HPGG motif, where
histidine represents a heme axial ligand for iron.

Figure 7A and B show the results of BlastP searches of the GenBank
database using the full-length CYB5RP amino acid sequence as the query. Figure 7A
shows the hit with highest homology, a hypothetical protein from sunflower. The
sunflower protein and CYB5RP share three His boxes (boxed) in which the spacing
between the His boxes is conserved. Also boxed is the HPGG motif typical for the
heme-binding domain of the cytochrome b5 protein family. In both proteins the first
histidine of the third His box is replaced by glutamine (a typical feature of desaturases
with delta 6 specificity). The upper amino acid sequences shown are from CYB5RP
and are portions of SEQ. ID. NO.:3. The lower amino acid sequences shown are
portions of the amino acid sequence of the hypothetical protein from sunflower
(Sperling et al., 1995, Eur. J. Biochem. 232:798-805). The sequence shown as
positions 348-432 is SEQ. ID. NO.:7. The sequence shown as positions 22-74 is
SEQ. ID. NO.:8. The sequence shown as positions 152-227 is SEQ. ID. NO.:9.
Figure 7B shows the hit with the second highest homology, a delta 6 desaturase from
Borago officinalis (Sayanova et al., 1997, Proc. Natl. Acad. Sci. USA 94:4211-4216).
The Borago protein and CYB5RP also share three His boxes with conserved spacing,
as well as the HPGG motif. In both proteins the first histidine of the third His box is
replaced by glutamine (a typical feature of desaturases with delta 6 specificity). The
upper amino acid sequences shown are from CYB5RP and are portions of SEQ. ID.
NO.:3. The lower amino acid sequences shown are portions of the amino acid
sequence of the Borago delta 6 desaturase. The sequence shown as positions 338-424
is SEQ. ID. NO.:10. The sequence shown as positions 12-64 is SEQ. ID. NO.:11.
The sequence shown as positions 153-220 is SEQ. ID. NO.:12.

Figure 8 shows additional results of BlastP searches of the GenBank
database using the CYB5RP protein as the query. Figure 8 shows the amino acid
alignment between the CYB5RP protein and a delta 6 desaturase from Synechocystis
sp. (strain PCC 6803) performed by the BlastP program. The Synechocystis delta 6
desaturase and CYB5RP share three His boxes, two of which are shown in Figure 8
(boxed). In both proteins the first histidine of the third His box is replaced by
glutamine (a typical feature of desaturases with delta 6 specificity). The CYB5RP
sequence shown is a portion of SEQ. ID. NO.:3. The Synechocystis sequence shown is
SEQ. ID. NO.:13.

Figure 9A shows the expression pattern of the CYB5RP gene in 9
human tissues, as determined by RT-PCR amplification with 21 cycles. Expression is
detected in human retina, kidney, pancreas, placenta, and brain. Figure 9B shows the
results of the analogous experiments performed with 25 cycles of amplification.
Expression of the CYB5RP gene is seen in all the human tissues studied.

DETAILED DESCRIPTION OF THE INVENTION 
For the purposes of this invention:

"Substantially free from other proteins" means at least 90%, preferably
95%, more preferably 99%, and even more preferably 99.9%, free of other proteins.
Thus, a CYB5RP protein preparation that is substantially free from other proteins will
contain, as a percent of its total protein, no more than 10%, preferably no more than
5%, more preferably no more than 1%, and even more preferably no more than 0.1%,
of non-CYB5RP proteins. Whether a given CYB5RP protein preparation is
substantially free from other proteins can be determined by such conventional
techniques of assessing protein purity as, e.g., sodium dodecyl sulfate polyacrylamide
gel electrophoresis (SDS-PAGE) combined with appropriate detection methods, e.g.,
silver staining or immunoblotting.

"Substantially free from other nucleic acids" means at least 90%,
preferably 95%, more preferably 99%, and even more preferably 99.9%, free of other
nucleic acids. Thus, a CYB5RP DNA preparation that is substantially free from other
nucleic acids will contain, as a percent of its total nucleic acid, no more than 10%,
preferably no more than 5%, more preferably no more than 1%, and even more
preferably no more than 0.1%, of non-CYB5RP nucleic acids. Whether a given
CYB5RP DNA preparation is substantially free from other nucleic acids can be
determined by such conventional techniques of assessing nucleic acid purity as, e.g.,
agarose gel electrophoresis combined with appropriate staining methods, e.g.,
ethidium bromide staining, or by sequencing.
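
The purity thresholds in the two definitions above can be illustrated with a short calculation. This is a sketch only: the function name purity_tier, the tier labels, and the idea of supplying target and total amounts (for example, band intensities from a stained gel) are assumptions made for illustration, not part of the definitions.

```python
# Thresholds (percent of total that is the target molecule) taken from the
# definitions above; the labels are illustrative shorthand.
PURITY_TIERS = [
    (99.9, "even more preferred"),
    (99.0, "more preferred"),
    (95.0, "preferred"),
    (90.0, "substantially free"),
]


def purity_tier(target_amount, total_amount):
    """Classify a preparation by the percent of its total that is the target."""
    if total_amount <= 0 or not 0 <= target_amount <= total_amount:
        raise ValueError("require total > 0 and 0 <= target <= total")
    percent = 100.0 * target_amount / total_amount
    for threshold, label in PURITY_TIERS:
        if percent >= threshold:
            return label
    return "not substantially free"
```

For example, a preparation in which 95 of 100 units are CYB5RP falls in the "preferred" tier, while one with 89 of 100 units does not meet the 90% minimum.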

"Substantially the same biological activity as CYB5RP" means being 
able to introduce a double bond into the 6 position of linoleic acid under conditions in 
which CYB5RP is able to introduce a double bond into the 6 position of linoleic acid.

A "conservative amino acid substitution" refers to the replacement of
one amino acid residue by another, chemically similar, amino acid residue. Examples
of such conservative substitutions are: substitution of one hydrophobic residue
(isoleucine, leucine, valine, or methionine) for another; substitution of one polar
residue for another polar residue of the same charge (e.g., arginine for lysine,
glutamic acid for aspartic acid); and substitution of one aromatic amino acid
(tryptophan, tyrosine, or phenylalanine) for another.
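
The example substitution classes listed above can be encoded as a simple lookup. The sketch below is illustrative: the groups contain only the residues named in the definition, which the text presents as examples rather than an exhaustive list, and the name is_conservative is hypothetical.

```python
# One-letter codes for the example groups named in the definition above.
CONSERVATIVE_GROUPS = [
    {"I", "L", "V", "M"},  # hydrophobic: isoleucine, leucine, valine, methionine
    {"R", "K"},            # polar, positively charged: arginine, lysine
    {"E", "D"},            # polar, negatively charged: glutamic acid, aspartic acid
    {"W", "Y", "F"},       # aromatic: tryptophan, tyrosine, phenylalanine
]


def is_conservative(original, replacement):
    """True if both residues differ and fall in the same example group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )
```

Under these groups, arginine for lysine counts as conservative, whereas arginine for glutamic acid does not, because the charge changes.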

The present invention relates to the identification and cloning of
cytochrome b5-related protein (CYB5RP), a gene which encodes a human delta 6
fatty acid desaturase. The gene is present on PAC clones 759J12, 756B3, 519013,
and 466A11 from an area of human chromosome 11q12 that has been shown to
contain a gene related to Best's macular dystrophy (Cooper et al., 1997, Genomics
41:185-192; Stohr et al., 1997, Genome Res. 8:48-56; Graff et al., 1997, Hum. Genet.
101:263-279). This linkage between the chromosomal location of the CYB5RP gene
and the location of the gene related to Best's macular dystrophy can be used
diagnostically by identifying restriction fragment length polymorphisms (RFLPs) in
the vicinity of the CYB5RP gene, e.g., in SEQ.ID.NO.:1. Such RFLPs will be
associated with the Best's macular dystrophy gene and thus can be used to identify
individuals carrying disease-causing forms of the Best's macular dystrophy gene.

CYB5RP was identified as an EST hit in sequence scanning data from
PAC clones from human chromosome 11q12. In addition, a full length cDNA of
CYB5RP was recovered from a human retina cDNA library. The genomic region of
CYB5RP has been sequenced and the exon/intron organization of CYB5RP has been
determined. The CYB5RP gene has 12 exons. The promoter region of CYB5RP was
identified upstream of the 5' UTR by detecting consensus elements required for
eukaryotic transcription. The expression pattern of CYB5RP was determined by
RT-PCR analysis in 9 human tissues. The CYB5RP gene is expressed predominantly in
human retina, kidney, pancreas, and placenta; lower levels of expression are also
detected in brain, heart, lung, liver, and skeletal muscle. Bioinformatic analysis
revealed significant homology to a group of plant and bacterial fatty acid desaturases.
All of the typical amino acid motifs present in these fatty acid desaturases are also
present in CYB5RP. Kyte-Doolittle algorithm analysis predicts a transmembrane
organization typical of fatty acid desaturases for CYB5RP (see Figure 5). CYB5RP is
unusual in that it contains a cytochrome b5 region in its N terminus. While many
fatty acid desaturases utilize cytochrome b5 as an electron donor, most have not
incorporated this cytochrome as part of their polypeptide chain.

That CYB5RP is a fatty acid desaturase is shown by the following
evidence:

(1) CYB5RP possesses significant homology to a group of plant
and microbial fatty acid desaturases;

(2) Like other fatty acid desaturases, CYB5RP has three conserved
histidine boxes, with correct spacing between the boxes; and

(3) The predicted membrane topology of CYB5RP is similar to
that of known fatty acid desaturases.

That CYB5RP is a delta 6 fatty acid desaturase is shown by the
following evidence:

(1) CYB5RP contains a cytochrome b5-like moiety fused to its
N-terminus. The only two fatty acid desaturases that contain a cytochrome b5-like
moiety fused to their N-termini are known or suspected to be delta 6 desaturases.

(2) The only two plant desaturases that are known or suspected to
introduce a double bond in the 6 position have an atypical His box 3 (QIEHH), with
a Q in the first position rather than an H. CYB5RP has the same atypical His Box 3.

(3) The only bacterial desaturase that is known to introduce a
double bond in the 6 position has an atypical His box 3 (QVTHH), with a Q in the
first position rather than an H. CYB5RP has the same atypical His Box 3.

CYB5RP is a target for the development of drugs for the treatment of
disorders of lipid metabolism and for the treatment of conditions that require the
modulation of the biosynthesis of prostaglandins and leukotrienes (asthma, pain, etc.).
CYB5RP is also a target for the development of drugs for use in treating skin
diseases, diabetic complications, reproductive disorders, including breast pain and
premenstrual syndrome, inflammatory and autoimmune disorders, cardiovascular
disorders, complications of viral infections, and various forms of retinal degeneration,
including age-related macular degeneration.

CYB5RP is homologous to a delta 6 desaturase from Borago officinalis
(see Figure 7B). Both CYB5RP and this Borago delta 6 desaturase, unlike
desaturases from higher plants, are unusual in containing a cytochrome b5-like
domain fused to their N-termini (Sayanova et al., 1997, Proc. Natl. Acad. Sci. USA
94:4211-4216; hereinafter "Sayanova"). The Borago desaturase has been expressed
in transgenic tobacco, resulting in high levels of delta 6 desaturated fatty acids in the
transgenic tobacco leaves, including high levels of γ-linolenic acid (GLA) (Sayanova).
Given the medical importance of GLA, Sayanova proposed that transgenic plants,
expressing the Borago delta 6 desaturase, would be valuable as sources of GLA.
Similarly, CYB5RP, expressed in transgenic plants, is expected to provide a valuable
source of GLA.

The present invention provides DNA encoding CYB5RP that is
substantially free from other nucleic acids. The present invention also provides
recombinant DNA molecules encoding CYB5RP. The present invention provides
DNA molecules substantially free from other nucleic acids comprising the nucleotide
sequence shown in Figure 2 as SEQ.ID.NO.:1. Analysis of SEQ.ID.NO.:1 revealed
that this genomic sequence defines a gene having 12 exons. These exons collectively
have an open reading frame that encodes a protein of 445 amino acids. When an
alternatively spliced exon 8 is used, a CYB5RP protein of 433 amino acids, lacking
amino acids 317-328, is produced. Thus, the present invention includes two cDNA
molecules, encoding two forms of CYB5RP protein, that are substantially free from
other nucleic acids. The first cDNA is shown in Figure 3 and has the nucleotide
sequence SEQ.ID.NO.:2. The second cDNA is identical to the first, except that it
does not contain the nucleotides at positions 1,019-1,054.
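
The correspondence between the missing amino acids (317-328) and the missing nucleotides (1,019-1,054) can be checked with simple arithmetic. The sketch below assumes the coding region begins at position 71 of SEQ.ID.NO.:2, as given elsewhere in this description (positions 71-1,405); the function names are hypothetical.

```python
CDS_START = 71  # first coding nucleotide in SEQ.ID.NO.:2, per this description


def residue_to_nucleotides(first_residue, last_residue, cds_start=CDS_START):
    """Map a 1-based amino acid range to its 1-based cDNA nucleotide range."""
    start = cds_start + 3 * (first_residue - 1)  # first base of first codon
    end = cds_start + 3 * last_residue - 1       # third base of last codon
    return start, end


def shorter_isoform_length(full_length, first_residue, last_residue):
    """Protein length after removing an inclusive range of residues."""
    return full_length - (last_residue - first_residue + 1)


if __name__ == "__main__":
    print(residue_to_nucleotides(317, 328))      # nucleotide range of the deletion
    print(residue_to_nucleotides(1, 445))        # nucleotide range of all 445 codons
    print(shorter_isoform_length(445, 317, 328)) # length of the shorter protein
```

The deleted span is 36 nucleotides, i.e. 12 codons, so the reading frame downstream of the deletion is preserved.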

The present invention includes DNA molecules substantially free from
other nucleic acids comprising the coding region of SEQ.ID.NO.:2. Accordingly, the
present invention includes DNA molecules substantially free from other nucleic acids
having a sequence comprising positions 71-1,405 of SEQ.ID.NO.:2. The present
invention also includes DNA molecules substantially free from other nucleic acids
having a sequence comprising positions 71-1,405 of SEQ.ID.NO.:2, except that the
nucleotides at positions 1,019-1,054 are missing. Also included in the present
invention are recombinant DNA molecules having a nucleotide sequence comprising
positions 71-1,405 of SEQ.ID.NO.:2 and recombinant DNA molecules having a
nucleotide sequence comprising positions 71-1,405 of SEQ.ID.NO.:2 with the 
exception that positions 1,019-1,054 are missing.

The novel DNA sequences of the present invention encoding
CYB5RP, in whole or in part, can be linked with other DNA sequences, i.e., DNA
sequences to which CYB5RP is not naturally linked, to form "recombinant DNA
molecules" encoding CYB5RP. Such other sequences can include DNA sequences
that control transcription or translation such as, e.g., translation initiation sequences,
promoters for RNA polymerase II, transcription or translation termination sequences,
enhancer sequences, sequences that control replication in microorganisms, sequences
that confer antibiotic resistance, or sequences that encode a polypeptide "tag" such as,
e.g., a polyhistidine tract or the myc epitope. The novel DNA sequences of the
present invention can be inserted into vectors such as plasmids, cosmids, viral
vectors, P1 artificial chromosomes, or yeast artificial chromosomes.

Included in the present invention are DNA sequences that hybridize to
at least one of SEQ.ID.NOs.:1 or 2 under stringent conditions. By way of example,
and not limitation, a procedure using conditions of high stringency is as follows:
Prehybridization of filters containing DNA is carried out for 2 hr. to overnight at 65°C
in buffer composed of 6X SSC, 5X Denhardt's solution, and 100 µg/ml denatured
salmon sperm DNA. Filters are hybridized for 12 to 48 hrs at 65°C in
prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and
5-20 X 10^6 cpm of 32P-labeled probe. Washing of filters is done at 37°C for 1 hr in a
solution containing 2X SSC, 0.1% SDS. This is followed by a wash in 0.1X SSC,
0.1% SDS at 50°C for 45 min. before autoradiography.

Other procedures using conditions of high stringency would include
either a hybridization carried out in 5X SSC, 5X Denhardt's solution, 50% formamide
at 42°C for 12 to 48 hours or a washing step carried out in 0.2X SSPE, 0.2% SDS at
65°C for 30 to 60 minutes.

Reagents mentioned in the foregoing procedures for carrying out high
stringency hybridization are well known in the art. Details of the composition of
these reagents can be found in, e.g., Sambrook, Fritsch, and Maniatis, 1989,
Molecular Cloning: A Laboratory Manual, second edition, Cold Spring Harbor
Laboratory Press. In addition to the foregoing, other conditions of high stringency
which may be used are well known in the art.

The degeneracy of the genetic code is such that, for all but two amino
acids, more than a single codon encodes a particular amino acid. This allows for the
construction of synthetic DNA that encodes the CYB5RP protein where the
nucleotide sequence of the synthetic DNA differs significantly from the nucleotide
sequence of SEQ.ID.NO.:2, but still encodes the same CYB5RP protein shown as
SEQ.ID.NO.:3. Such synthetic DNAs are intended to be within the scope of the
present invention. Also within the scope of the present invention are synthetic DNAs
that encode a CYB5RP protein lacking amino acids 317-328 of SEQ.ID.NO.:3.
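
The statement that all but two amino acids are encoded by more than one codon can be verified against the standard genetic code. The following sketch builds the standard codon table from its conventional compact form (bases in the order T, C, A, G); the function names and the example peptide are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_counts():
    """Number of codons per amino acid, excluding stop codons ('*')."""
    return Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def single_codon_amino_acids():
    """Amino acids encoded by exactly one codon."""
    return sorted(aa for aa, n in codon_counts().items() if n == 1)


def synonymous_sequence_count(peptide):
    """How many distinct DNA sequences encode the given peptide."""
    counts = codon_counts()
    return prod(counts[aa] for aa in peptide.upper())


if __name__ == "__main__":
    print(single_codon_amino_acids())        # the two exceptions named in the text
    print(synonymous_sequence_count("MWL"))  # toy peptide, not from SEQ.ID.NO.:3
```

The two exceptions are methionine (M) and tryptophan (W). Because the counts multiply across residues, even a short peptide can be encoded by many distinct DNA sequences.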

Another aspect of the present invention includes host cells that have
been engineered to contain and/or express DNA sequences encoding CYB5RP
protein. Such recombinant host cells can be cultured under suitable conditions to
produce CYB5RP protein. An expression vector containing DNA encoding CYB5RP
protein can be used for expression of CYB5RP protein in a recombinant host cell.
Recombinant host cells may be prokaryotic or eukaryotic, including but not limited to,
bacteria such as E. coli, fungal cells such as yeast, mammalian cells including, but not
limited to, cell lines of human, bovine, porcine, monkey and rodent origin, plant cells
such as tobacco, and insect cells including but not limited to Drosophila and
silkworm derived cell lines. Cell lines derived from mammalian species which are
suitable for recombinant expression of CYB5RP protein and which are commercially
available, include but are not limited to, L cells L-M(TK-) (ATCC CCL 1.3), L cells
L-M (ATCC CCL 1.2), 293 (ATCC CRL 1573), Raji (ATCC CCL 86), CV-1 (ATCC
CCL 70), COS-1 (ATCC CRL 1650), COS-7 (ATCC CRL 1651), CHO-K1 (ATCC
CCL 61), 3T3 (ATCC CCL 92), NIH/3T3 (ATCC CRL 1658), HeLa (ATCC CCL 2),
C127I (ATCC CRL 1616), BS-C-1 (ATCC CCL 26) and MRC-5 (ATCC CCL 171).

A variety of mammalian expression vectors can be used to express
recombinant CYB5RP in mammalian cells. Commercially available mammalian
expression vectors which are suitable include, but are not limited to, pMC1neo
(Stratagene), pSG5 (Stratagene), pcDNAI and pcDNAIamp, pcDNA3, pcDNA3.1,
pCR3.1 (Invitrogen), EBO-pSV2.neo (ATCC 37593), pBPV-1(8-2) (ATCC 37110),
pdBPV.MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC 37199), pRSVneo
(ATCC 37198), and pSV2-dhfr (ATCC 37146). Following expression in
recombinant cells, CYB5RP can be purified by conventional techniques to a level that
is substantially free from other proteins. A description of vectors that can be used to
express CYB5RP can be found in, e.g., Goeddel, ed., 1990, Meth. Enzymol. vol. 185
or Perbal, 1988, A Practical Guide to Molecular Cloning, John Wiley and Sons, Inc.

The present invention includes CYB5RP protein substantially free
from other proteins. The amino acid sequence of the full-length CYB5RP protein is
shown in Figure 3 as SEQ.ID.NO.:3. Thus, the present invention includes CYB5RP
protein substantially free from other proteins having the amino acid sequence
SEQ.ID.NO.:3. Also included in the present invention is a CYB5RP protein that is
produced from an alternatively spliced CYB5RP mRNA where the protein has the
amino acid sequence of SEQ.ID.NO.:3 with the exception that amino acids 317-328
are missing.

As with many proteins, it is possible to modify many of the amino
acids of CYB5RP and still retain substantially the same biological activity as the
original protein. Thus, the present invention includes modified CYB5RP proteins
which have amino acid deletions, additions, or substitutions but that still retain
substantially the same biological activity as CYB5RP. It is generally accepted that
single amino acid substitutions do not usually alter the biological activity of a protein
(see, e.g., Molecular Biology of the Gene, Watson et al., 1987, Fourth Ed., The
Benjamin/Cummings Publishing Co., Inc., page 226; and Cunningham & Wells,
1989, Science 244:1081-1085). Accordingly, the present invention includes
polypeptides where one amino acid substitution has been made in SEQ.ID.NO.:3
wherein the polypeptides still retain substantially the same biological activity as
CYB5RP. The present invention also includes polypeptides where two or more
amino acid substitutions have been made in SEQ.ID.NO.:3 wherein the polypeptides
still retain substantially the same biological activity as CYB5RP. In particular, the
present invention includes embodiments where the above-described substitutions are
conservative substitutions. In particular, the present invention includes embodiments
where the above-described substitutions do not occur in the His boxes of CYB5RP.
In particular, the present invention includes embodiments where the above-described
substitutions do not occur in positions where the amino acid present in those positions
in CYB5RP is the same as the amino acid present in the corresponding position of the
sunflower protein depicted in Figure 1 of Sperling et al., 1995, Eur. J. Biochem.
232:798-805 when these two proteins are aligned by BLASTP analysis. In particular,
the present invention includes embodiments where the above-described substitutions
do not occur in positions where the amino acid present in those positions in CYB5RP
is the same as the amino acid present in the corresponding position of the

CCCTCTACCCCTGTCCCATCAGGC (SEQ.ID.NO.:15)
One of skill in the art would recognize that many other primer pairs
based upon SEQ.ID.NO.:2 would also be suitable.

PCR reactions can be carried out with a variety of thermostable
enzymes including but not limited to AmpliTaq, AmpliTaq Gold, or Vent polymerase.
For AmpliTaq, reactions can be carried out in 10 mM Tris-Cl, pH 8.3, 2.0 mM
MgCl2, 200 µM for each dNTP, 50 mM KCl, 0.2 µM for each primer, 10 ng of DNA
template, 0.05 units/µl of AmpliTaq. The reactions are heated at 95°C for 3 minutes
and then cycled 35 times using the cycling parameters of 95°C, 20 seconds, 62°C, 20
seconds, 72°C, 3 minutes. In addition to these conditions, a variety of suitable PCR
protocols can be found in PCR Primer: A Laboratory Manual, edited by C.W.
Dieffenbach and G.S. Dveksler, 1995, Cold Spring Harbor Laboratory Press; or PCR
Protocols: A Guide to Methods and Applications, Michael et al., eds., 1990,
Academic Press.
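
The cycling program described above for AmpliTaq can be written down as data, and its total programmed hold time computed. This is a sketch under stated assumptions: the per-cycle denaturation temperature is read here as 95°C, temperatures serve only as labels, ramp times between steps are ignored, and the function name is hypothetical.

```python
# Each step is (temperature in degrees C, hold time in seconds).
INITIAL_DENATURATION = (95, 3 * 60)  # 95 C for 3 minutes
CYCLE_STEPS = [
    (95, 20),      # denaturation (temperature assumed, see lead-in)
    (62, 20),      # annealing
    (72, 3 * 60),  # extension
]
CYCLES = 35


def total_hold_time_seconds(initial, steps, cycles):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(seconds for _, seconds in steps)
    return initial[1] + cycles * per_cycle


if __name__ == "__main__":
    seconds = total_hold_time_seconds(INITIAL_DENATURATION, CYCLE_STEPS, CYCLES)
    print(f"{seconds} s, about {seconds / 60:.0f} min")
```

Each cycle holds for 220 seconds, so the 35 cycles plus the initial 3-minute step come to 7,880 seconds, a little over two hours before ramping is counted.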

A suitable cDNA library from which a clone encoding CYB5RP can
be isolated would be a Human Retina 5'-stretch cDNA library in lambda gt10 or
lambda gt11 vectors (catalog numbers HL1143a and HL1132b, Clontech, Palo Alto,
CA). The primary clones of such a library can be subdivided into pools with each
pool containing approximately 20,000 clones and each pool can be amplified
separately.

By this method, a cDNA fragment encoding an open reading frame of
either 445 amino acids (SEQ.ID.NO.:3) or 433 amino acids
(SEQ.ID.NO.:3 lacking the amino acids at positions 317-328) can be obtained. This
cDNA fragment can be cloned into a suitable cloning vector or expression vector. For
example, the fragment can be cloned into the mammalian expression vector
pcDNA3.1 (Invitrogen, San Diego, CA). CYB5RP protein can then be produced by
transferring an expression vector encoding CYB5RP or portions thereof into a
suitable host cell and growing the host cell under appropriate conditions. CYB5RP
protein can then be isolated by methods well known in the art.

As an alternative to the above-described PCR method, a cDNA clone
encoding CYB5RP can be isolated from a cDNA library using as a probe
oligonucleotides specific for CYB5RP and methods well known in the art for
screening cDNA libraries with oligonucleotide probes. Such methods are described
in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York; Glover, D.M. (ed.), 1985,
DNA Cloning: A Practical Approach, IRL Press, Ltd., Oxford, U.K., Vol. I, II.
Oligonucleotides that are specific for CYB5RP and that can be used to screen cDNA
libraries can be readily designed based upon the cDNA sequence of CYB5RP shown
in SEQ.ID.NO.:2 and can be synthesized by methods well-known in the art.

Genomic clones containing the CYB5RP gene can be obtained from
commercially available human PAC or BAC libraries available from Research
Genetics, Huntsville, AL. PAC clones containing the CYB5RP gene (e.g., PAC
clones 759J12, 756B3, 519O13, and 466A11) are commercially available from
Research Genetics, Huntsville, AL (Catalog number for individual PAC clones is
RPCI.C). Alternatively, one may prepare genomic libraries, especially in P1 artificial
chromosome vectors, from which genomic clones containing the CYB5RP gene can be
isolated, using probes based upon the CYB5RP sequences disclosed herein. Methods
of preparing such libraries are known in the art (Ioannou et al., 1994, Nature Genet.
6:84-89).

The present invention also provides oligonucleotide probes, based
upon SEQ.ID.NO.:2, that can be used to determine the level of CYB5RP RNA in a
sample. In particular, the present invention includes DNA oligonucleotides
comprising at least 18 contiguous nucleotides of SEQ.ID.NO.:2. Also provided by
the present invention are corresponding RNA oligonucleotides. The DNA or RNA
oligonucleotide probes can be packaged in kits.
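As an illustration of what "at least 18 contiguous nucleotides" means in practice, the following Python sketch enumerates every 18-nucleotide window of a sequence and derives the corresponding RNA and reverse-complement forms. The input used here is a short made-up sequence, not SEQ.ID.NO.:2, and the function names are hypothetical. Whether a sense or an antisense oligonucleotide is wanted depends on the detection method; a probe intended to hybridize to an RNA transcript would ordinarily be complementary to it.

```python
def contiguous_probes(seq, length=18):
    """Return every window of `length` contiguous nucleotides from `seq`."""
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(table)[::-1]


def to_rna(dna):
    """Return the RNA oligonucleotide corresponding to a DNA sequence."""
    return dna.upper().replace("T", "U")


# Made-up 26-nucleotide example sequence, for illustration only.
toy_sequence = "ATGGCTAGCTAGGATCCGATCGTACG"

probes = contiguous_probes(toy_sequence)
for probe in probes[:2]:
    print(probe, to_rna(probe), reverse_complement(probe))
```

A sequence of N nucleotides yields N - 17 such windows, so the 26-nucleotide example gives nine candidate 18-mers.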

In addition to the utilities described above, the present invention makes
possible the recombinant expression of the CYB5RP protein in various cell types. In
particular, it is advantageous to recombinantly express CYB5RP in plant cells. Such
expression in plant cells provides a method for the production of high levels of
valuable EFAs such as GLA and OTA in the recombinant plant cells. An example of
such recombinant expression of a delta 6 fatty acid desaturase, in that case from
borage, is described in Sayanova et al., 1997, Proc. Natl. Acad. Sci. USA 94:4211-4216
(Sayanova). The recombinant expression of the borage delta 6 desaturase led to
the production of high levels of GLA and OTA in the leaves of the tobacco plants in
which it was expressed. The procedures described in Sayanova can be easily adapted
to express CYB5RP in tobacco, thus providing an additional, useful way to produce
large amounts of valuable EFAs. Known methods of recombinantly expressing genes
in other plant species besides tobacco can be used to express CYB5RP in those other
species.

The present invention also makes possible the development of assays
which measure the biological activity of the CYB5RP protein. Such assays using
recombinantly expressed CYB5RP protein are especially of interest.

Assays for CYB5RP protein activity can be used to screen libraries of
compounds or other sources of compounds to identify compounds that are activators
or inhibitors of the activity of CYB5RP protein. Such identified compounds can
serve as "leads" for the development of pharmaceuticals that can be used to modulate
the activity of CYB5RP in patients suffering from conditions where that activity is
abnormal, e.g., skin diseases, diabetic complications, inflammatory and autoimmune
disorders, cardiovascular disorders, complications of viral infection, and retinal
dysfunction such as macular degeneration.

Such assays may comprise:

(a) recombinantly expressing CYB5RP protein in a host cell;

(b) measuring the biological activity of the recombinantly
expressed CYB5RP protein in the presence and in the absence of a substance
suspected of being an activator or an inhibitor of CYB5RP protein;

where a change in the biological activity of the recombinantly
expressed CYB5RP protein in the presence as compared to the absence of the
substance indicates that the substance is an activator or an inhibitor of CYB5RP
protein.
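The comparison in step (b) can be sketched as a small decision rule. This is an illustrative sketch only: the function name `classify_substance` and the 1.5-fold cutoff are hypothetical and are not taken from the present description, which requires only that a change in activity be observed. In practice a cutoff would be chosen from the variability of the assay, with replicate measurements.

```python
def classify_substance(activity_with, activity_without, min_fold_change=1.5):
    """Compare activity measured with and without the test substance.

    `min_fold_change` is a hypothetical cutoff used to ignore small
    differences that could be measurement noise.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    ratio = activity_with / activity_without
    if ratio >= min_fold_change:
        return "activator"
    if ratio <= 1 / min_fold_change:
        return "inhibitor"
    return "no effect detected"


# Made-up activity readings in arbitrary units.
print(classify_substance(30.0, 10.0))
print(classify_substance(4.0, 10.0))
print(classify_substance(11.0, 10.0))
```

Expressing the result as a ratio to the substance-free control keeps the rule independent of the units in which desaturase activity is measured.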

In particular embodiments, the biological activity of the recombinantly
expressed CYB5RP protein is the ability to introduce a double bond into the 6
position of linoleic acid or alpha-linolenic acid.

In some embodiments, it may be advantageous to insert additional
steps between steps (a) and (b). Such additional steps might include lysing the host
cell and fractionating its contents in order to partially purify the recombinantly
expressed CYB5RP, thus facilitating exposure of the recombinantly expressed
CYB5RP to the substance as well as to any substrate used in the assay.

The present invention includes activators and inhibitors identified by
the methods described herein as well as pharmaceutical compositions comprising
such activators and inhibitors. The activators and inhibitors are generally combined
with pharmaceutically acceptable carriers before use to form pharmaceutical
compositions. Examples of such carriers and methods of formulation of
pharmaceutical compositions containing activators or inhibitors and carriers can be
found in Remington's Pharmaceutical Sciences. To form a pharmaceutically
acceptable composition suitable for effective administration, such compositions will
contain an effective amount of the activator or inhibitor.

Therapeutic or prophylactic compositions are administered to an
individual in amounts sufficient to treat or prevent conditions where CYB5RP activity
is abnormal. The effective amount can vary according to a variety of factors such as
the individual's condition, weight, sex and age. Other factors include the mode of
administration. The appropriate amount can be determined by a skilled physician.

Compositions can be used alone at appropriate dosages. Alternatively,
co-administration or sequential administration of other agents can be desirable.

The compositions can be administered in a wide variety of therapeutic
dosage forms in conventional vehicles for administration. For example, the
compositions can be administered in such oral dosage forms as tablets, capsules (each
including timed release and sustained release formulations), pills, powders, granules,
elixirs, tinctures, solutions, suspensions, syrups and emulsions, or by injection.
Likewise, they can also be administered in intravenous (both bolus and infusion),
intraperitoneal, subcutaneous, topical with or without occlusion, or intramuscular
form, all using forms well known to those of ordinary skill in the pharmaceutical arts.

Advantageously, compositions can be administered in a single daily
dose, or the total daily dosage can be administered in divided doses of two, three or
four times daily. Furthermore, compositions can be administered in intranasal form
via topical use of suitable intranasal vehicles, or via transdermal routes, using those
forms of transdermal skin patches well known to those of ordinary skill in that art. To
be administered in the form of a transdermal delivery system, the dosage
administration will, of course, be continuous rather than intermittent throughout the
dosage regimen.

The dosage regimen utilizing the compositions is selected in
accordance with a variety of factors including type, species, age, weight, sex and
medical condition of the patient; the severity of the condition to be treated; the route
of administration; the renal, hepatic and cardiovascular function of the patient; and
the particular composition thereof employed. A physician or veterinarian of ordinary
skill can readily determine and prescribe the effective amount of the composition
required to prevent, counter or arrest the progress of the condition. Optimal precision
in achieving concentrations of composition within the range that yields efficacy
without toxicity requires a regimen based on the kinetics of the composition's
availability to target sites. This involves a consideration of the distribution,
equilibrium, and elimination of a composition.

The present invention also includes antibodies to the CYB5RP protein.
Such antibodies may be polyclonal antibodies or monoclonal antibodies. The
antibodies of the present invention are raised against the entire CYB5RP protein or
against suitable antigenic fragments of the protein that are coupled to suitable carriers,
e.g., serum albumin or keyhole limpet hemocyanin, by methods well known in the art.
Methods of identifying suitable antigenic fragments of a protein are known in the art.
See, e.g., Hopp & Woods, 1981, Proc. Natl. Acad. Sci. USA 78:3824-3828; and
Jameson & Wolf, 1988, CABIOS (Computer Applications in the Biosciences) 4:181-186.
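One way such antigenic fragments are commonly sought is by averaging per-residue hydrophilicity values over a sliding window and picking the most hydrophilic stretch, in the manner of the Hopp & Woods reference cited above. The sketch below is a minimal illustration, not the procedure of the present description: the scale values are those commonly tabulated for the Hopp-Woods scale and should be checked against the cited paper before use, the six-residue window is an assumption, and the peptide is made up.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (verify before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(seq, window=6):
    """Average hydrophilicity of each window of `window` residues."""
    scores = [HOPP_WOODS[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(seq, window=6):
    """Return (1-based start, peptide, mean score) of the top-scoring window."""
    profile = hydrophilicity_profile(seq, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best + 1, seq.upper()[best:best + window], profile[best]


# Made-up 15-residue peptide, for illustration only.
start, peptide, score = most_hydrophilic_window("LLVAGKDERKSAVLF")
print(start, peptide, round(score, 2))
```

For the made-up peptide the top-scoring window is the charged stretch KDERKS starting at residue 6. A high score only suggests surface exposure, so candidate fragments chosen this way would still need to be tested experimentally.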

For the production of polyclonal antibodies, CYB5RP protein or an
antigenic fragment, coupled to a suitable carrier, is injected on a periodic basis into an
appropriate non-human host animal such as, e.g., rabbits, sheep, goats, rats, or mice.
The animals are bled periodically and sera obtained are tested for the presence of
antibodies to the injected antigen. The injections can be intramuscular,
intraperitoneal, subcutaneous, and the like, and can be accompanied with adjuvant.

For the production of monoclonal antibodies, CYB5RP protein or an
antigenic fragment, coupled to a suitable carrier, is injected into an appropriate
non-human host animal as above for the production of polyclonal antibodies. In the case
of monoclonal antibodies, the animal is generally a mouse. The animal's spleen cells
are then immortalized, often by fusion with a myeloma cell, as described in Kohler &
Milstein, 1975, Nature 256:495-497. For a fuller description of the production of
monoclonal antibodies, see Antibodies: A Laboratory Manual, Harlow & Lane, eds.,
Cold Spring Harbor Laboratory Press, 1988.

Gene therapy may be used to introduce CYB5RP polypeptides into the
cells of target organs, e.g., the pigmented epithelium of the retina or other parts of the
retina. Nucleotides encoding CYB5RP polypeptides can be ligated into viral vectors
which mediate transfer of the nucleotides by infection of recipient cells. Suitable
viral vectors include retrovirus, adenovirus, adeno-associated virus, herpes virus,
vaccinia virus, and polio virus based vectors. Alternatively, nucleotides encoding
CYB5RP polypeptides can be transferred into cells for gene therapy by non-viral
techniques including receptor-mediated targeted transfer using ligand-nucleotide
conjugates, lipofection, membrane fusion, or direct microinjection. These procedures
and variations thereof are suitable for ex vivo as well as in vivo gene therapy. Gene
therapy with CYB5RP polypeptides will be particularly useful for the treatment of
diseases where it is beneficial to elevate CYB5RP activity.

The present invention is not to be limited in scope by the specific
embodiments described herein. Indeed, various modifications of the invention in
addition to those described herein will become apparent to those skilled in the art
from the foregoing description. Such modifications are intended to fall within the
scope of the appended claims.

Various publications are cited herein, the disclosures of which are
incorporated by reference in their entireties.

WHAT IS CLAIMED:

1. A recombinant DNA molecule encoding a polypeptide having
the amino acid sequence of SEQ.ID.NO.:3.

2. A recombinant DNA molecule comprising a nucleotide
sequence selected from the group consisting of:

SEQ.ID.NO.:1;
SEQ.ID.NO.:2;
SEQ.ID.NO.:2 lacking positions 1,019-1,054;
positions 71-1,405 of SEQ.ID.NO.:2; and
positions 71-1,405 of SEQ.ID.NO.:2 lacking positions 1,019-1,054.

3. A DNA molecule that hybridizes under stringent conditions to
the DNA molecule of claim 2.

4. An expression vector comprising the DNA of
claim 1.

5. A recombinant host cell comprising the DNA of claim 1.



6. A CYB5RP protein, substantially free from other proteins,
having an amino acid sequence selected from the group consisting of SEQ.ID.NO.:3
and SEQ.ID.NO.:3 lacking positions 317-328.

7. The CYB5RP protein of claim 6 containing a single amino acid
substitution.

8. The CYB5RP protein of claim 7 where the substitution is a
conservative substitution.



9. The CYB5RP protein of claim 6 containing amino acid
substitutions where the substitutions do not occur in positions where the amino acid
present in CYB5RP at those positions is also present in the corresponding position in
the delta 6 desaturase from sunflower when CYB5RP and the delta 6 desaturase from
sunflower are aligned by BLASTP analysis or where the substitutions do not occur in
positions where the amino acid present in CYB5RP at those positions is also present
in the corresponding position in the delta 6 desaturase from Synechocystis when
CYB5RP and the delta 6 desaturase from Synechocystis are aligned by BLASTP
analysis or where the substitutions do not occur in positions where the amino acid
present in CYB5RP at those positions is also present in the corresponding position in
the delta 6 desaturase from borage when CYB5RP and the delta 6 desaturase from
borage are aligned by BLASTP analysis.

10. An antibody that binds specifically to the CYB5RP protein of
claim 6.

11. A DNA or RNA oligonucleotide probe comprising at least 18
contiguous nucleotides of at least one of the sequences of claim 2.

12. A method for determining whether a substance is an activator
or an inhibitor of CYB5RP protein comprising:

(a) recombinantly expressing the CYB5RP protein of claim 6 in a
host cell;

(b) measuring the biological activity of the recombinantly
expressed CYB5RP protein in the presence and in the absence of a substance
suspected of being an activator or an inhibitor of CYB5RP protein;

where a change in the biological activity of the recombinantly
expressed CYB5RP protein in the presence as compared to the absence of the
substance indicates that the substance is an activator or an inhibitor of CYB5RP
protein.






13. The method of claim 12 where the biological activity of
CYB5RP protein is the ability to introduce a double bond into the 6 position of
linoleic acid.

14. A pharmaceutical composition comprising an activator or an
inhibitor of CYB5RP.

15. A method of treating macular degeneration comprising
administering to a patient an effective amount of the pharmaceutical composition of
claim 14.

gctcacagac
tgcgggacgc 
gtggagtcaa 
gtgccccggg 
ggccgcctcg 
cgcctggatt 
ctggccgggg 
at tataaa ac 
ctcgtacggc 
cggaggcggc 



cgggactccg 
ccaacaggtg 
gagcctggaa 
gcagggctgg 
acgccgccct 
ggagcggacg 
gcgggggggc 

ggggagttcc 
ggccgcggcg 
gcccgggagc 




cctccggttc 
cgtgttgtgt 
gccggcagcc 
gtggcggccg 
ccctggcggc 
cgggggtcag 
aggcgaggcg 

ctgcgccgcg 
gcagggcggg 
gctCTTCGCT 



ccgagggcgt 
ccccaggccc 
cgggaaaagg 
ctgtcctccc 
caatggagac 
ccagccttgg 
aggcgggcgc 
agccgggagg 
gccggagcag 

TCCCTCGGGG 



ggcgaggcgc 
cgcgctccgg 
gggcgggacg 
gggaggggcg 
cgaggccccg 
gggccggggc 
cgtccgcgcg 
cgcacgctcg 
cgggcggcgg 
TCTTCCTCgfi 



ACCI^GGCCA rracr-[r^.G A TCCCCAGGhC TCGTGCGTGC AgCATGgGCG 



aCGTCGGGGA GCCGGGAC CG CGGGAGGGAC CCGCGCAGCC GGGGgCGCCG 
CTGCCCACCT TCTGCTGGGA GCAGA TCCGC GCGCACGACC AQCCCQGCGh 
GTCATCGA GC GCCGCGrCTA CGAChTChQC CGCTGGGCAC 



PAAGiTGGCTG 



701 AGCGGPArrr AGG G carAGC CGcrTTATCG GCCACCACGg CGCTGAGGAC 



751 GCCACGgtaa 

801 gagctcggtc 

851 cctccctccc 

901 ggtggactag 

951 ggaggacccg 

1001 tcagaggtgg 

1051 cctgttgctc 

1101 ttgctgagtg 

1151 gacatgccat 

1201 cccaagagct 

1251 tagcagctgc 

1301 ctggaagcac 

1351 tcccaatccc 

1401 ggtactgtct 

1451 tccttagtta 

1501 cttgaccctc 

1551 tcaagacccg 

1601 gaggttctgt 

1651 ctgctctctg 

1701 ctctttgagg 

1751 ggagtgagaa 

1801 aaagagaacg 

1851 gggacagcac 

1901 gcaggtgttt 

1951 ccagggcgtg 

2 001 agccctggcc 

2051 tttaaaaaag 

2101 agggtctcac 

2151 ttcctcctgc 

2201 cactcccggc 

2251 agaaaacttt 

2301 cccagactgg 

2351 cccaggttca 

2401 cgggcacgcg 

2451 tttcaccatg 

2501 ccacctcggc 

2551 cggctgggat 



ggaagccata 
gtgggcgtga 
ctcgctgacc 
ccagggccag 
cagctgtcca 
gtgtgtcatg 
ccaggtccct 
ctaggggtag 
tagaggctgg 
tctgtaaagg 
acgtgggagg 
agtcacccca 
agcccctgtc 
ggctggagtc 
cccacctgca 
caaatttcta 
aggacatgaa 
ccagtcccag 
ccgattgcca 
aagaaagccc 
taagaatcct 
ccttgtccgt 
agccgtggga 
gcatgtctgg 
gcgaggtgag 
tcccttcaag 
tatctatttt 
tatgttgctc 
ctcagcctcc 
ctgctctagt 
actccccacc 
agtgcaatgg 
agcgattctc 
ccaccacgcc 
ttggccaagc 
ctcccaaagt 
acagaaagct 



aggaagccac 
tgtcccgctc 
tttgacctcc 
atgtggggta 
cggagcaggt 
ctgcagagcc 
gtttcagttc 
ggcagggcag 
gggctgggcc 
gctcagggac 
gctttgccag 
ggaacaggct 
tagacaggca 
cagtggtgga 
taataggggt 
ggttggccac 
tcctgaatgc 
gactgtcggc 
tctccagcat 
ctcttttccc 
cctgaaattc 
ggctgttcag 
tgaagcagcc 
gtgagtgtgg 
gggcacggct 
gagtcttgtg 
atttattatt 
gggctggtct 
gaaagttctg 
cttttgtaac 
aaccgccgga 
cgccatcttg 
ctgcctcagc 
cagcatattg 
tggtctcgaa 
gctgggatta 
tttatttcat 



ccaccggcgg 
cacctgtggg 
acgccgggac 
gggagggcag 
ctgcggggga 
tgccctgggt 
tgggtcccca 
ggtccccagg 
ggcctgaggt 
agtgactcac 
ccaggctggg 
ggcccctggg 
gggatgtagc 
gcagcccgac 
tggggccacg 
actgggtatc 
tggctttttg 
gtccctcttg 
gttggacaat 
tttccacccc 
taaaaaaaga 
gcgccagacg 
tgggggcagt 
tgtgtgtgcc 
tctccccaaa 
gatgcctgct 
atttgtttaa 
caaagtcctg 
ggattacagg 
ctagaggaca 
gacagagtct 
gctcactgca 
ctcccgagta 
tatttttagt 
ctcctgacct 
caggcgtgag 
cactgtttcc 



gtggagcctg 
gccttagcat 
ccagagttgg 
ttccctgcgt 
ggagggggcc 
gaggggctgc 
tgctgggtgc 
ggccggtaag 
ctgtggcttt 
ctctccgggc 
tgggcctctc 
gaccccaact 
ctggccccag 
cagccccttt 
atgccctgtc 
aggaaggtct 
ggcagcagcg 
ccagggccac 
cttcactgga 
atgaagctga 
aaaaaaaaaa 
ctggcccgag 
atttgagcgt 
tgcctttctg 
ggccttgctg 
ctggtctttt 
aaatagagac 
ggttcaagca 
catgagccac 
gtatggatac 
tgctctgcca 
acctccgcct 
gctgggatta 
agagacgggg 
cgtgatccac 
ccaccacgcc 
tgcctggtgc 



FIG.2A
caggcccatg
gcttgggatc 
caataatatc 
ctgtaatccc 
gggagtttga 
aaaaatacaa 
ctttgggagg 
ccggtgagcc 
aaactccgtc 
ccgggtgcca 
tgtcactcct 
ggcagacagt 
gcccagggct 
taagtgtgag 
acggctaccc 
gtaggaaagg 
gcctggacta 
tgtgaggttg 
agggtagacc 
ggccccaggc 
ccagttgcca 
cccatcagct 
tcccagagtt 
aggccgggtg 
gagagtcata 
gcccagccgc 
actatgtgcc 
tttaatcctc 
cacttgacag 
ttcagtgatt 
gctatgctgc 
tgtttgactc 
gttctcacct 
gttcacagca 
cgtcctgtgt 
gagggcctgg 
gtttgtgtcc 
cccagagagc 
gatccttaat 
gggaggatgg 
gagtgtggcc 
gtccagtgag 
gagagggagc 
cttctgggtc 
gtcctgggca 
catgggcaac 
caactcaggg 
ctgaagccct 
gccctctgcc 
agctggagga 
ttgcagccaa 
gtaactgatt 
gcccattttg 
aaccttctca 



ctggggttcc 
ctgagacttc 
tgttttaaaa 
agcactttgg 
gaccagcctg 
aattagccag 
ctgaggcagg 
gagatcgcgc 
tcaaacaaac 
gctctgctat 
ctgtgcccca 
gagctggctg 
gccaggtagt 
cgtgccctgc 
ctttcctggg 
gacactttag 
gagttgccag 
ctgggatggt 
aggggttggg 
atcccatccc 
gggatgtgtt 
gttctcaagt 
tctgtgccat 
ccctggggag 
ggcagcctga 
tgctggccat 
aggttttgaa 
aaggtgcccc 
atgagaaaac 
ggtctgggat 
cacctatcct 
attaaatcca 
ctgaagatgt 
gagggsitcag 
tccaggaata 
cagtgaggtc 
tcggaaggac 
acagcaggga 
ggagagagtg 
agacctgccc 
agggcagaga 
agccattaga 
cagcctggcc 
tcagtttccc 
cactatgagg 
aaagctctat 
cggatgagga 
aacaactgtg 
ttaggaccct 
gggagtggcc 
cagttattga 
gaattcgcct 
tagatgagga 
ggacccttga
tcccaagtgg
catcacacag 
acatctcagg 
gaggctgagg 
accaacatgg 
gcgtggtggc 
agaatcgctt 
cattgcactc 
aaacaaaaaa 
tggaggcact 
gtttactcat 
agggcaggac 
agtttgtatt 
aaactgcagc 
tgaccttatt 
gctgtctctt 
aaatacttgg 
gcaatcagtc 
ccagcccagt 
gggcggtggc 
cc^gcggtca 
ccaggcaatg 
gaagtcagcc 
agtactctgg 
tactagtgga 
aagtatataa 
atcagtactt 
tatgaggcac 
agaggctcag 
ttgaatccac 
gtttatttcc 
tcagtgagca 
agctgtgagc 
ctagagcaaa 
cagtatggct 
tagaggcggc 
tttggcttta 
agcctgggga 
aaggcagagc 
tcccccaagg 
gcaagagagg 
cacattagat 
tcgctctatg 
cattagtgtg 
gtggtgctgg 
tcatgggtgt 
gtttcccagc 
ggaaaatcca 
ccttctaggt 
agtgctgcag 
ctaggcactg 
aacaacttta 
gactgagttt 
agggtagggc 



aattactgac ttaacattta 
ttttctcatt gattcgcagc 
ccgagcgctg tggctcacac 
tgggcagatc acctgaggtc 
agaaaccctg tctcttctaa 
gcatgcctgt aatcccagca , 
ga^cccagga gacggaggtt 
cagcctgggc aacaagagca 
catctctctg ctccttgggg 
gagcgacctt gaagcaggca 
ctgtaaagtg ggagagctgg 
tgtgtctcct caagcccatg 
cggtaaatgc tgctggcccc 
gtatggtggg acagccctgc 
tggttacggt cctatctgaa 
agctccctca aggccccaca 
tccattcagg ccaaagggac 
tttgtccatg atgaacccac 
gccctgtgta gttgagccca 
ctcaggtgga ggtggggcag 
cctctcacca gccccggctg 
aagccttcct gccaggaaat 
tgtggccatc ttgggacaca 
gcccttggcc aggtttgtct 
gccagccagg gagggatgag 
gggccatgtg ctgagtgcct 
gatttattga aaccctctct 
gtaccattta ttgttattgc 
agaggcaaag tggcttgaaa 
agccatgttc ttaagggcat 
ggcactcatt gattcttcaa 
tcttctctgt gtcatgcatg 
aaaacttcta cagggaatga 
ggctcagagg tgggaccgtg 
gcagcagaga gcagtggaga 
cgggctggct catgctggat 
ttttaaagag gatggggagc 
gtctgatgga catttaaaag 
cttccagaag ggtaagagaa 
gaggccactc agaagaggta 
ctgtggacac aggcacactg 
ttagcttcat gttgtcttta 
atcttggaca catcctttca 
atgaggatga gaatgctttt 
gcacctgggt gcctggttac 
ggtgaatgca ttgcccacag 
agcccctggt gccctttcgg 
agttccagca gaccccctga 
ggttctctga gcctggcctg 
cagaggctgc ttcatagtaa 
ttctgagggg tttagatgtg 
tgaggtaagt cctattgtta 
gaaactgggg ggtgtaatgg 
ctttgtactc gggccacgag
5301 ggtggggttt gtgtctgggt gggagctggg gagggacagg actaggatta
5351 ggcagatctg aggccacagg agttggttgg ggggtggctc cagagccact 
5401 ccactccctc ctaccacatt gactgccttg aaagtcccct aatggccact 
5451 cccatgaagt gtgactgctc tgggctcccc gcaggcgttfc tctgcaaggc 
5501 caccgcccac ccaggcccct tccccagagg ggctgcagtg ccttgctcct 
5551 tccttgtggg aagagttggg attgtctggc gtcagcagga tactgcccct 
5601 gggcatccct cccggtctct tcctgcgggt ttctgatgaa acagccaggc 
5651 tccagtagtg gagccagagg tcagtggtgg agagaggacc aggagccaga 
5701 gggtatagct gctttggggc tactgtgggg tcagggacac ttgtgaggcc 
5751 aagcgtcctg gctgcaggag ccctcacata tatgcccacc cttcaccagg 
5801 acattgaggg gtgctggggg acaggggtag ctttttgggg gtgtctgcct 
5851 tcgacttggg ctccgctaca caggccaaat ttggatgtcc catgtttaga 
5901 gctgtgtttc tttgggacct cttggggcct cagtttcctc atctgtaaaa 
5951 tgggatactg atagtgcttc cccactggcc tcctctgacg ggcgccaggg 
6001 agaggatggg acggagcatg gtgtgctggg cacgctcctg ctgtacccac 
6051 ccacctggga gaggggagag gcaggaatgt cctgggggtg tcctttgagg 
6101 catagccctg tcaccccaac atcctacaaa ggcatgagaa ggcagcgagg 
6151 acagaccccg accacctgag ccctcagcag ccctgccaca ctccctgctt 
6201 cacccccttc ctgactgatc tggcacattc ttgattctcc tagggagtga 
6251 cccaaaatcc ctccctgccc tgctgtgtct ctggggtgga aggaggctgc 
6301 cagcccctcc tctctcccag cctcaggctt ggccaggact taacaggcag 
6351 gcagagaagc agcttctcca ctctcttccc tgacacctgt aggcccctcc 
6401 tgcaggcact tacctctaag tggactctca ggaggaggct catcagggct 
6451 gcagggctca gaaagagctg ggctgtggag ctcttgccaa ccgccaggcc 
6501 ccttctaagt gctttagcgc caccgactgc atcctcccag cagccttgtg 
6551 agatggggat ttgtggttcc cagtttactg atgagaaata ctgatgagag 
6601 atgggtgtgg tcttgtctgg ggctccctgg ctcctggata gcagctcagg 
6651 ttccatcctg ggcaggctgg ctctgggaca cccccccgac cagctgctgt 
6701 gtgggattca cggtggggct tgggcagggc gtgggatctt ggggccaact 
6751 gagccactct aggcttccag ggaccaaggc caggctgagc tgtctctgta 
6801 tcctgagaga gcatgaacat cacagaagat gggcccgggt tcgaatccca 
6851 gctctgccac tactaactgg gacctgggca ggggtccctt cccgctgagc 
6901 cttcatttcc tcaccagcaa aatggttcgt gcccctgctt tgggggctgt 
6951 ggagggttgg ctcttgtcta cttgttcata cctgctgttg agcagctgct 
7001 ctgtgccggc ctctgaggat gccactgtga acagagcctg tcgctacctc 
7051 caggagcttg tgtttagggg tgccgttttg attccagcac tttcacccag 
7101 ctctgctccg gtacccgatg agagacgtcg agtgccgctt tccactcgct 
7151 tgggtgcgtg tgggggttgg ggggacaggg ctttgtgcac gtagccctgg 
72 01 gtggatgttc ctgggtgcac ttagggtgtg tgagggtggg acctcccaca 
7251 gttccctgag gctccactga tgaggtccaa gaaccgcctt cctgcccccc 
7301 agcccaggct cccagcagct gggcccttgg cttcttgaga tagtgactgg 
7351 cctcacggca aggacccccg cacaccacct aggagaactg ctgcttcccc 
7401 tctgttccag gagtggcgac aagcacagtt tttcgctttt gtttttgttt 
7451 tcttcacttt aagttccggg aaacgtgcag aatgtgcagg tttgttacat 
7501 aggtatacat gtgccatggt ggtttgctgc acccgtcaac ccctcatcta 
7551 ggttttaagc tccatataca ttaggcattt gtcctaatgc tctccctccc 
7601 cttgcccctc acccgcccag taagccccgg tgtgtgatgt tcccttccct 
7651 gtgtccatgt gttctcattg ttcaactctc acttatgagt gagaagagac 
7701 ctggactctg atctaacctc ggtcaaatgg aactgtgtga ccttgaagaa 
7751 gtagcttaac ctctctgagt cttagcttct gcctggcacc cccatcctta 
7801 aggagaggcc cacagaggac caggtcacat gacctcagcc agttccagag 
7851 aaggctgttt gcttccaggt ttcggcctga gtccaggccc ctgccctact 
7901 cgcactccct gatagcatga gaagcacagc cccagggtgc ccacccagct 
7951 ctgagagccc agcctgcttc ccagggaact gtcacagccc cacctgtccc
ttccccagct
tgagggggct 
aaggccctgg 
gggctggcct 
atctggaccc 
gggtgataga 
cagcttgcac 
cgtaccatgt 
attgtgggca 
gtgggtgtgt 
gagggactca 
acacgtgctg 
aagccagagg 
ggaggaggca 
agggggagga 
ttggggaagg 
acactgtgca 
caatggtaaa 
ggcgtggtgg 
gaggattgct 
aaccccgqict 
acctctaatc 
ctgggaggtg 
ctgggcgaca 
aattagaggc 
ctgaggtggg 
acatag;tgag 
ggtggcatgt 
tcacttacga 
gcactccagc 
gtgtttttct 
tgggaggctg 
tgggcaacat 
gctgggcatg 
gtgggaggat 
tgcaccactg 
aaaaaaaaaa 
aagagtcttc 
ggtagctggg 
cccttttttc 
gggggtgaga 
aagtcagccc 
ccaagctgcc 
atgtaggtgt 
tcccccaggg 
tcacttccgt 
aataggaggc 
gcaactggga 
ctactggttc 
agcctctgaa 
agtgagccga 
gagcccaggt 
ctggtctctt 
cacccccttt 



ggagccctgt 
cacacttccc 
ggcaggcttg 
gggtgaatca 
aagatcctca 
gggaagggct 
gttgcgccgg 
gtgcaaggct 
tgtttctgag 
ctgcatgtgc 
ccatcacgct 
caacatggat 
caaaggaaca 
gatctgtatg 
gagaatggag 
tgaaaaggtt 
tgcacttaat 
tttcatgtat 
cttatgcctg 
tgagctcagg 
ctacgaaata 
ccacctactc 
gaggttgcag 
gagcaagact 
caggtgtggc 
aggatcgctt 
accttgactc 
gcctgtagtc 
ccaggatttc 
ctggtgacag 
gggctgggcg 
aggtgggtgg 
ggcaaacctc 
gtggtgcaca 
cacctgagcc 
cactctagcc 
aaaaaaaaaa 
ttccctccca 
ggcccagaca 
taccagtttt 
tctgcactta 
aggttattcc 
ctgggcctat 
tacctcttag 
tggtaggaga 
tttaagactc 
ccctgtcctc 
ggcctccctt 
tataagctgc 
agtacaagga 
ttacaatacc 
gtgaagttca 
tccctggccc 
ctttactctt
caatggcttt ggggttctct gacacagccc
cttatcattg caaggggtag atctggcttg 
gttctgtcct cccctgtcag tgcctcgaca 
ggaccaacgg gaaaggaggc gaggagacca 
gctcaataag gtggccccag aactgacatg 
gggagggagg agattctggg gccgcagcca 
gtgtgtctgt gcgtgccagc tgcatctttg 
gtgtttggct gagtgttcat gtgggccgtg 
tgtctgagtg atgcctgctg gtgtgggctg 
gtgtgtgtct ggggagtttc aaaggagaaa 
ggctcagcct taaaaaggta ggacatcctg 
ggaccttaag gacattgtgc tgagtgaaac 
aacatgtgat ttctcccaga tgaggtttcc 
gacagaaggt agcatggtgg ttgccggggc 
aattagtgtt taatggggac agagtttcag 
ctggagctgg atgatggtga tggttggaca 
accactgagc tggacaccta aaaatgctta 
attttactac aatttttaaa aaattggctg 
taatcccaac actttgggag gccaaggcgg 
agttcaacac cagcctgggc aatatggtga 
tacaaaaatt agcctggtgt ggtggcttgc 
agtaggctaa ggcacaagaa tctcttgaac 
taagccgaga tcatgccact gcaacccagt 
ctgtctcaaa aaataaaaga taaataaaaa 
tcacacctgt actctcaaca ctttgggagg 
gaagtcaggc atttaagaca tgcctaggca 
tacaaaaaaa ttcaaaagtt aatgagacat 
ct^gctgctg gggaggctga ggtgggagga 
aaggctgcag tgagctgtga ttgcatcact 
agtgaggccc tgtctcaaaa aaatttttca 
tggtggctca ttcctgtaat tccagcactt 
attgcttgag cccaggagtt taagaccagc 
atctctacaa aaaataaaaa taaaaaatta 
cctgtactaa cagctacgag agaggctaag 
cgggaggttg aggctgcagt gagccatgat 
tgggcgatac agcaagaccc tatctcaaaa 
aaaaacaccc agtggggtca gtagaacccc 
gctcccctgt acaccagccc cagctctgca 
gcttcctggg gacccccagc cttccctctg 
gctgcccctc cttcaagact catgtccaga 
tacagccccc tcctctgtaa tgagtgagcc 
agaaggggca ccctaccagc cccccagtcc 
aaaagcaggc aaggggaccc ctagtagatc 
tgggtgctgg aggggcctga agtgctttct 
atgtcctggc agtgacttca gggcccgctg 
accagctggt aggctcatta gcaagaggac 
agtcagcttt cttcaaaggt gtttccttta 
ctccagaccc atggggacaa caccacccag 
tgtatggctc tggctagccc attcagagaa 
aaaaaatcag tccaagagct gtgaacaatt 
aagaccacag gcagacctgg aaggctaagt 
agcttacttt acttctgggc cacttcctgg 
ttatctttct cctggtctgt cttctcttct 
tcttccttct cctgcatcgt actccacccc
cactccagct

ttatgatgtt 

cccaggctgg 

tcttacagtg 

atttacttta 

ttccagcact 

taaagaccag 

ggaggctgag 

gaactgagat 

tctcaaaaaa 

agtcaaatca 

ctcaatgtgt 

cctcagttcc 

aaaaacgggg 

gtggaaactt 

tggttcaaga 

ctgcagatgt 

ggatggggca 

gggaggacca 

ttcctgtccg 

ctgtctctgc 

ctgactgtct 

TTAATTTTCT
attacacaga atcgcgagaa
ttcttttttg taaaaataga 
tcttgaactc ctggcctcaa 
ctgggattac agatgtgagc 
aaaaaaaaat taggctgggc 
ttgggaggcc aaggtgggca 
cttggccacc tggggtcagg 
accggagaat tgcttgaacc 
catgccattg catgccagcc 
aaaaaaaatt atgttttgtg 
gtttaactgt tcaagtgtct 
gtcgcccttg actgatcccc 
aggttttccc acctaccctt 
taaatcaaat gttcgtgacc 
gtatgactta agctttttgg 
aatcaaagtc ttcccgggag 
ttgaccgtgt tcagagaggg 
cagcaggcaa tgggtgaaaa 
gggagggccc atgtctttga 
cctgtcgtct gctctgccca 
tggttgccgc tgtgctactc 
gctctccttc ag^ftTCCCTT 



tgttggatta 

gacaaggtct 

gcaatcctcg 

caccatgcct 

gcggtggctc 

gatcaactga 

agtttgagac 

caggaggtag 

tgggcaacag 

ctcctgcttc 

tccttgcaaa 

ccgccccgtg 

cacccactgc 

cagatcttat 

aaaagcagaa 

gtctttctgt 

gcccttgtgc 

gcaggacaac 

ctgttcatca 

tccatccgta 

agctgtgtct 

rraTGCCTTC 



ttcattttat 

cactatgtgg 

tgccttggcc 

ggcccatttt 

acacctataa 

ggtcaggagt 

cagctactcc 

aggttgcaat 

agcaagactg 

ctgctttgta 

cccccaagga 

acccagtggt 

ttatgtttat 

tctacatgca 

ccttttttcg 

aaatccagag 

tgggtgaagt 

ctggggccct 

gccggctgac 

gtccttccgc 

gtctgtccgc 

r^TP AAGATC 



nasi CCG CAACAAC CCAGCCAGGA TCGACCCCTG 



TYZ^ATTPGG AGAGCTGGCT 



11901 
11951 
12001 
12051 
12101 
12151 
12201 
12251 
12301 
12351 
12401 
12451 
12501 
12551 



gagaggctca 
tgccacatgg 
agccagggct 
gaagctgtag 
caggcactcg 
tgtccccatc 
agcgtcttac 
cagccagtgc 
tcctctttgc 



gcccctgagg gagggggatg 
ccaggagcag ctccctcggc 
gagcctgccc tcccctccca 
ggatgccctg agaagtccag 
tttggatccc gaggcaagct 
aaaaggagga ttttgatgaa 
ccaccccata ccttttggga 
tccagctcac accccgggct 
ccacacccct tgggcctggc 
atgggggtgg ggaggaattt 

ncr^a GCGCAG PTCGTCGAGG 



ctccaggaga 

r.r ApjcGAGG ArATCjG^GCT GTTTT.ATGCr AGTCCCACC^ II I^Z^^ ™ 



ftAT atoaqcc 
gctggagggc 
attcgcccaa 
gggggcaggc 
ggctccagat 
ccctccctgt 
ctgatttctc 
gggagaggag 
gggtactctt 
gatgggagga 
cttccttggc 
^PTTCCGAGC 



agagccctag 
tgggagacat 
ggggatgcag 
agttgaaagt 
ctggtttagc 
tgtcgcccag 
tcctggctgt 
gcttcaccac 
gtcacttcat 
gcggctgggg 
tgatcggccc 

rrTGTAGCAG 



rrTftCTcr^^C Cfir^TrrTGG CC^TGqfiqqT ^^rTGGrrT^^ ^^TS^rS^S 
T^rcTCCTGGr. TrrTGGCTGG GTGCCCA P^ crrTGGrcGr CTTTATCCTG 



^7f.01 GCr ATCTCTC 



12651 
12701 
12751 
12801 
12851 
12901 
12951 
13001 
13051 
13101 



aacagacgtg 
tgaatgaatg 
agtgaacagt 
ccagatgagc 
gccactgagc 
cctccacctc 
ccacaagggt 
aaaggattca 
gcttctgacc 

GArCTGGGCC 



AGataacccc 
ggcccccatg 
gacctatgaa 
ctgaaggccc 
caagaagttc 
cagaggctgc 
cctgggactc 
cttgaaatgg 
ctggggcccc 
ctgcctctct 

ATGCCTCCAT 



agttctgtgt 
catctgggca 
aggatgaatg 
atcaggcatg 
cttcttgaac 
tgccctgcag 
agttctcatc 
agcattagca 
aggccctggc 
ccccagG£3!Q 

CTTCAAGAAG 



tgcagccacc 
ttgtgaacat 
gatgaat:aaa 
tctgtgggtc 
agattccgat 
cttcatgaca 
tgtaaaaaga 
cgggggtacc 
gggctccgtc 
^prrrTGGTG 



ttaactgccc 
atttgctaaa 
cagatgaatg 
aagctgcatt 
caagcacagg 
cttacgagcc 
ggacactggc 
ctgcaagctg 
cttcccaaca 

TTTGCAGCAT 



rrrrrvnaTGG7\ ArrArGTGGC 



13151 CC AGAAGTTC GTGATGGGGC 



13201 
13251 



gccaggtgct 
gggcacagcc 



gggtggcgct 
tgccctgaga 



AGCTAAAG ot, 
gggtctgccc 
gccccctcct 



gagggtgggg 
aagtgtgtgg 
cctccacagfi 



tgggtggtca 
gcacagtcgg
13301



rrAPTCCiTCG: A&rTTrrr; r r APTrrrAGrA CCACGCCfiArT ccCftACJiTCT 

fipr^nTY^nrrtr crGTCTTCCT CCTGGGGGAG 



13351 TC rACAAAGA CCCAGACGTG



13401 TCATCCGTCG



13451 
13501 
13551 
13601 
13651 
13701 

13751



tgcagctgag 
agcccctgat 
ctttccttcc 
aatctagagt 
ccgggagaag 
ctccacag2a 

CACCTGTACT 



^gtgggtgg 
ggggagctaa 
ctggcctcca 
cacctcccaa 
tgcctttgac 
gaggtggcct 

TnnrAAGAAG 



ggagggacct 
tgcactgggt 
ctctggctgg 
cctgctgggg 
ccttggcccc 
ggagagctgc 

AAACGCAGAT 



ggacaacctc 
ccccactctg 
gccaagctct 
acgaccagcc 
agccagcccc 
tgtctccagc 

ArrTACCCTA 



tggctgggcc 
cccctgacct 
gcccccgtgt 
cgcttgctag 
gtgaccttgc 
cgccgcctgt 

CAAGCAGCAG 



13801 
13851 
13901 
13951 
14001 
14051 
14101 
14151 
14201 
14251 
14301 
14351 



gcctgcactg 
ggcggggcac 
tgcagtctgt 
ccgtgcctgg 
cttccccaag 
ggagccctag 
aagtctggcc 
ggcagcctgg 
cagctggatg 
gagcccctca 
gaacactctc 

rTTACrCTGG TGAACTTTGA 



gagtgcctgg 
cacttcctgg 
tcccagcctc 
cagctgtctc 
ctgctgggag 
agagtggtcc 
gtgaggactc 
gtggggtaag 
tggggtcatt 
tggagtaaca 
ccccactcct 



tgagtgtcca 
tcctccctgc 
gtctgtcagg 
acacactctc 
gacaccttgc 
ccccacgtag 
agtctttggg 
ctgacttgcg 
ggaggcttgg 
tgaccagcac 
ctctggcgcc 
atgccccttc 

AGTGGAAAAT 



tctgtccttc 
tgtcctggac 
tctccctggt 
ccagcagcat 
agccacgggc 
aatttcttct 
cagttgttgg 
gtgaagggtg 
acacagaatt 
caacgtccca 
cactcacctt 
ttgcaglCSS 

PTGGCGTACA 



tggggtgggg 
cactcccagc 
catggcatcc 
gcctttgccc 
catcacagcc 
tgccctcact 
ggcggacaga 
gtgggaggtg 
gggggtgata 
ggggcattcc 
ggcagcccag 

TGCTgGTGTG 



14401 C ATGC AGTGG 



14451 

14501 

14551 

14601 

14651 

14701 

14751 

14801 

14851 

14901 

14951 

15001

15051 

15101 

15151 

15201 

15251 

15301 

15351 

15401 

15451 

15501 



ccgtggcagg 
acagggtgca 
cagtgtggct 
ctgtacgtgt 
ctcggtgagg 
catctgtgtc 
cgtcactaaa 
tactcagctg 
gcaaatcagg 
ggcctccaga 
tcctctgcag 



SCfigtgagtg 
aggtggtgcc 
catgtgcgtg 
cacatgtgtg 
ggtggggggg 
gtgtcttccc 
cctggcaggg 
atcagcctcg 
agaaaaccag 
cagaagggtt 
cctccgggca 

GATTTGCTCT 



gggttgccca 
tcgggggaca 
caacagtgtg 
cgcgcagcag 
ggttgaggaa 
aggaggagtt 
tcttccccaa 
tgagctggca 
agagggtggt 
ggatgcctga 
cctggagacc 

GGGCCGCCAG 



ggaccccggg 
gtacctgccc 
gctcacatgt 
gagagcgagt 
cagggggggt 
gctgggccga 
cacaccctgc 
gggcaaggac 
ggcctgtcct 
ggtcctcctc 
tctcggtatc 
CTTCTATGCC 



catacggctg 
atgaaggcaa 
atgcgtgcaa 
gtgcccgtga 
gtgggtctct 
ctctgccagg 
atgacacctt 
cctgttcctt 
gggctctgag 
ccacccacca 
gcctctgccc 

Pfy-TTTTTCT 



TATPCTACCT PPCCTTCTAP OGCGTC rr^ GGGTGCTGCT CTTCTTTGTT 



igt 

cccccactgc 
gggccagtgg 
agcgtggcag 
gtaagtggcc 
gacactgctc 
caatcctatt 
ctccagcctc 
gcccgagcct 
PAPAGATGAA 



atggcaggga 
agccccccac 
tactgcctcc 
gagcccaggg 
cggtgggtgt 
cattcagatt 
gtacagataa 
cactcagggt 
ccccacagSSS. 

PPACATCCCC 



gtggcgaggt 
cagagcttcc 
ctggcttgct 
tcggtgggtt 
cggagctgct 
ctttaaacac 
ggaagtcaag 
gcctaagtgg 

TPPTGG AAAG 



cacacacagg 
cttttcccgt 
ggtggaatca 
tagggagcgt 
ctggactcag 
tggcaagggg 
gccacttggg 
tgagctggac 

PP^t-TGGTTC 



cgacaggtga 
ctgcagaatg 
cataaacaca 
ggcctggctt 
cctcacagtg 
gcgatggcca 
gacagctgct 
ctagggcagt 
r:Tv;TGGATCA 



AAGGAGATCG GCCACGAGAA 



TGGGTCAGCT 



15601 
15651 
15701 
15751 



tggggggtcc 
cggggctgcc 
cctcctcaca 

PGTr;GAGCCC 



CTCAGataaa 
cagctaggag 
aggtggggga 
tgtgccccgc 
TCACTTTTCA 



cagcaggggt 
ccagatggca 
tggtgccgtg 
cggcttccgg 

PPAAPTGGTT 



ggggcccatc 

aagcagggat 
gggtcaggga 
cagCEGSC^ 

(-AGPGGGCAC 



nrArcGGGAC 
ctgggtgggg 
gaggccctga 
tctgcaacgg 

PPAPPTCCAA 
P^AAPTTCC 



15801 AGATCGAGCA 

15851 gggggtcctg 



££^gtgagtg 
ggaggggatc 



tgggtgctgg 
ctgggagggg 



gggccagtgg 
acccgtgggt 



gaggtgggga 
ggggcctctc 



FIG.2F 
15901 tctggaatct cccacttcag gtgccagcat acgctcccca cccccagCCE
15951 rTTTCCCAGr. ATY^rrnAG Ar arAACTArftr; rracGrGcrr CCGCTGGTCA
16001 AGTCGrTGTG TGrrAAGCAC GGrCTCAGCT ArGftftOTT,?^ rTTCCTTCCTC
16051 ArcGCGCTGG TGGArATCGT £Mgtgaggc tgcagcccgg cccctctgtc
16101 ctggtggctt ccccagggcc tatgcctacc cttgtccagg tcagcctoat
16151 gctgagcccc cagggtccct gagcctttct gtccacgtcc catgcccttc
16201 ctcccttccc cagccttcac gcacacagtg agaatttctg gagcacctac
16251 tgcagactca caaacagcag tgcctgcggt gagcaggtct atgcaaacct
16301 acccccaaag gctgagggaa aaaagctaac agatccagtt tctcagaagg
16351 aaacacttaa cagggactca taaacagaag ccatgtctca gggccgggtg
16401 cggtggctca cgcctgtaat tccagcactt ggggaggctg aggtgggcgg
16451 atcacttgag gtcaggagtt cgagaccagc ctggccaaca tggtgaaacc
16501 ccgtctctac taaaaaaaaa aaaaaaaaac aaaacaaaac aaaaattagc
16551 tgggtgtggt ggcaggtgcc cataatccca gctacttggg aggctgaggg
16601 aggagaatca cttgaactcg caggggcaga ggttgcagtg agctgagatt
16651 gtgcctttgc agtccagcct gggcaacaga gcaagactct ctcaaaaaca
16701 aacaaaaaaa ccatgtctca ggcagccaag agttgggaca tcccctcaca
16751 cgccctctag aaagaaccct ctatatagca agcttttagg gtgaacccca
16801 tgcaggtggt tcttatgaac ctggtgacca ctggaggtta gataagcgtc
16851 tacaagagga ggttatctait gccatgagct tggcattcag ggtcaagcat
16901 cggtcatcag acagttttgc ttgaagatgg cattgccctt gtagcaatgc
16951 aggctctaga gagcttcctg ccctcttgga gctgotgttc cttccagcaa
17001 aggaaacagc aagcaattaa aataacaaat aagtacatta cagaagatgg
17051 gcaaaagaac aatgaaaagc ccctcagggt ggggacaggg gaggggaggg
17101 gggcggccag gcaggggcgg cagtttctaa ataggtggta gggtgggcag
17151 tattgacagg ctgacgtgtg agcagggaca gggaggaggg gagaggtctc
17201 gccacaggga catctggcaa agagcgttca ggcagagggc acttgaccct
17251 gaatgccaag ctcatggcat agatagccga ggcaggcatg caggcactca
17301 gagaagggac acgcccggct tgcatcttgg aaagctgccc ctactgggaa
17351 tgactggcgg gcaggagtcg aagtggaaaa ggagagcaga ggacactgca
17401 gccatccagg cgaggggtga tggggctcag cccttgtggt caccttggag
17451 gtggggaaca gaggccagat tccaggtctt atacctctgc gcctttgtac
17501 acgctgttcc ccttacttgg ttgcccttcc ttcctgtgct ggtgttcaga
17551 tgcccacttc tccttcatga tctctcccag cctgatgctc tgagcccctg
17601 ccatttggca cagcccttta gagcgcctgg cacagggctt cctagcagat
17651 tgttgacatt tctggctcca ctgcccaata tcaggcccaa gatcgggtgg
17701 gcaggttcca cgtcctctct gtccttgggt tgcagcgccc agcaggaggc
17751 agcaatggag aactgggtgc aggagggaca ggcccaccca ggctcatgcc
17801 tggacttggc cttggctgcc ctccagctcc cctacccgac acccgtcacc
17851 ccggtctaga ttccattcca gagaatgagc attcagctgt tctcccaacc
17901 caccctccag cccgcatcgc tgcctgcccc cagggaaggg aacccacagg
17951 gaatggggat ctccgctcac acttaccatg ggggatacag gggtgttagg
18001 atcttgcaac tgagctccta acacccaccc ccactgccac ccccacctcc
18051 cag<ST£C£IS aagaagtctg gtgacatctg GCTGGACqCC TftCCTCCATC
18101 AGTGAAGGC A ArACCCAGGC GGqC ftGAGAA GGnfTTftSgS rftCCAGCAAC
18151 CA AGCrAGTC rrCGGCGGGA TTGATACCcr rarrrcTCrA CTGGCCAGCC
18201 Txy^GgGTorr rTxycTffrrr TTPTOTftrT [^rrqjlZIS^ ^ri^A^jj
18251 CTCA rATGTG TATTTAGTAG CCCTATqGrr TTGGCTTTGr. r<QTQhTSQQ
18301 ACAGGGGTAG AGGGAAGGT G AGrATAGCI^r ATTTTrCTAG AGCGAGftATT
18351 GGGGGAAAGG TGTTATTTTT A T ATTAAAAT ArATTCACftT ^TftTTATSgA
18401 SI



FIG.2G 
1 CTTCGCTTCCCTCGGGGTciTXXrTCGGACCTCGGCCACCGCCTGGGATCC 50

51 CCAGGACTCGTGCGTCCAGCATGGGCGGCGTCGGGGAGCCGGGACCGCGG 100 

1 MGGVGEPGPR 10 

101 GAGGGACCC(k:GCAGCCGGGGGCACCGCTGCCCACCTTCTGCTGGGAGCA 150 



11 EGPAQPGAP LPTFCWEQ 27



151 GATCCGCGCGCACGACCAGCCCGGCGACAAGTGGCTGGTCATCGAGCGCC 200 

28 IRA HDQPGDKWLVIERR 44 

201 GCGTCTACGACATCAGCCGCTGGGCACAGCGGCACCCAGGGGGCAGCCGC 250 

45 VYDISRWAQRHPGGSR 60 


251 CTCATCGGCCACCACGGCGCTGAGGACGCCACGGATGCCTTCCGTGCCTT 300 

61 LIGHHGAEDATD AF RAF 77

301 CCATCAAGATCTCAATTTTGTGCGCAAGTTCCTACAGCCCCTGTTGATTG 350 

78 HQDLNFVRKFLQPLL IG 94 

351 GAGAGCTGGCTCCGGAAGAACCCAGCCAGGATGGACCCcixSAATGCGCAG 400 

95 ELAPEEPSQDGPLNAQ 110 

401 CTGGTCGAGGACTTCCGAGCCCTGCACCA<kx:AGCCGAGGACATGAAGC'r 450 

111 LVEDF RALHQAAEDMKL 127 


451 GTTTGATGCCAGTCCCACC*nCTTTGCTTTCCTACTGGGCCACATC^ 500 

128 FD AS PTFFAFLIiGH I LA 144 


501 CCATGGAGGTGCTGGCCTGGCTCCTTATCTACCTCCTGGGTCCTGGCTGG 550 

145 MEVLAWLLIYLLGPGW 160 


551 GTCCCCAGlxkcCTCGCCGCCTTCATCCTGGCCATCTCTCAGGCTCAGTC 600 
161 VPSALAAFILAISQAQS 177


601 CTCGTGTCTGCAGCATGACCTGGGCCATGCCTCCATCTTCAAGAAGTCCT 650 
178 WCLQHDLGHAS IFKK SW 194 



651 GGTGGAACCACGTGGCCCAGAAGTTCGTGATGGGGCAGCTAAAGGGCTTC 700 
195 WNHVAQKFVMGQLKGF 210 

FIG.3A 






WO 00/21557 PCT/US99/23253

10/19 

701 TCCGCCCACTCGTGGAACTTCCGCCACTTCCAGCACCACGCCAAGCCCAA 750 

211 SAHW WNFRHFQHHAKPN 227

751 CATCTTCCACAAAGACCCAGACGTGACGGTCGCGCCCGTCTTCCTCCTGG 800

228 IFHKDPDVTVAPVFLLG 244

801 GGGAGTCATCCGTCGAGTATGGCAAGAAGAAACGCAGATACCTACCCTAC 850 

245 ESSVE YG KKKRRYLPY 260 

851 AACCAGCAGCACCTCTACTTCTTCCTGATCGGCCCGCCGCTGCTCACCCT 900 

261 NQQHLYFFLIGPPLLTL 277

901 GGTGAACTTTCAAGTCGAAiATCTGGCGTACATGCTGGTGTGCATCCAGT 950 

278 VNFEVENLAYMLVCMQW 294 

951 GGGCGGATT^CTCTCGGCCGCCAGCTTCTATGCCCGCTTCTTCTTATC 1000 

295 ADLLWAASF YARF FLS 310 

1001 TACCTCCCC-nCTACGGCGTCCCTGGGGTCCTGCTCTTCTTO 1050 

311 YLPFYGVPGVLLFFVAV 327

1051 CAGGGTCCTGGAAAGCCACTCGTTCGTGTGGATCACACAGATGAACCACA 1100 

328 RVLESHWFVWITQMNHI 344 

1101 TCCCCAAGGAGATCGGCCACGAGAAGCACCGGGACTGGGTCAGCTCTCAG 1150 

345 PKEIGH EKHRDWVSSQ 360 

1151 CTCGCAGCCACCTGCAACGTXXSAGCCCTCACTTTTCACCAACTGGTT^ 1200 

361 LAATCNVEPSLFTNWF S 377 

1201 CGGGCACCTCAACTTCOVGATCGAGCACCACCTCTTCCCCAGGATGCCGA 1250 

378 GHLNFQIEHHLFPRMPR 394 

1251 GAC^CAACTACAGCCGGGTGGCCCCGCTGGTCAAGTCGCTCTCTG^ 1300 

395 HNY S RVAPLVKS LC AK 410



1301 CACGGCCTCAGCTACGAAGTCAAGCCCTTCCTCACCGCGCTGGTGGACAT 1350 

411 HGLSYEVKPFLTALVDI 427

1351 CGTCAGGTCCCTCAAGAAGTCTGGTGACATCTGGCTGGACGCCTACCTCC 1400 

428 VRSLKKSGDIWLDAYLH 444 

FIG.3B 
1401 ATCAGTCAA<kx:AACACCCAGGCGGGCAGAGAAGGGCTcAGGGCACCAGC 1450

445 Q 

1451 AACCAAGCCAGCCCCCGGCaGGATCGATA<i:CCCCACCCC'k:CACTGGCC^ 1500 

1501 GCCTCGGGG-iGCACTGCCTCCCCTCCTCGTACTCTTGTC'kcCCC^ 1550 

1551 CCCCTCACATCTCTATTCAGCAGCCCTAT^CCTTGGCTCTGGGCCTG^^ 1600

1601 GGGACAGGG^TAGAGGGAAGGTGAGCATAGC^CATTTTCCTAGAGCGAGA 1650

1651 ATTOGGGGAAAGCTCTTATTTOATATrAAAATACATTCAGATGTAAA^ 1700



FIG.3C 
1 GTACAGCGGCAATGGGCGGTCTCGGGGAGCCCGGAGGGGGACTCGGGCCG 50

1 MGGVGEPGGGLGP 13 

51 CGGGAGGGGCCCGCACCGCTCGGGGCGCCCCTACCCATC'TOCGCTGGG^ 100 

14 REGPAPLGAPLPIFRWE 30

101 GCAGATCCGCCAGCATCACCTACCAGGCGACAAGTGGCTGGTCATCGAGC 150

31 QIRQHDLPGDKWLVIER 47

151 GCCGTCTCTACGACATCAGCCGCTGGGCACAGCGGCACCCAGGGGGTAGC 200

48 RV YDISRWAQRHP GG S 63



201 CGCATCATCGGCCACCACGG 220 
64 R I I G H H 69 



FIG.4 
PROFILESCAN of: CYB5rp_correct_protein check: 5714 from: 1 to: 445

GETSEQ from bmd, December 2, 1997 14:20.

Compare to profile library: GenRunData:profilescan.fil



Profile: profiledir:cytochrome-b5.prf

Gap weight: 4.50 Gap Length weight: 0.05

Ave match: 0.27 Ave mismatch: -0.21
(Peptide) PROFILEMAKE v4.40 of: 0191.Msf2{»} Length: 48

Sequences: 24 MaxScore: 27.58 December 2, 1992 00:07
This profile is derived from PROSITE release 10.0 and has been tested
by a database search against SWISS-PROT release 26.0. A comparison
of the SWISS-PROT annotation and the results of the database search follows.
For further information about this motif, consult the . . .

Profile: profiledir:cytochrome-b5.prf alignment: 1

Quality: 20.77 Gaps: 0
Ratio: 0.43 Length: 48
Normalized quality: 2.91

S 31 HDQPGDKWLVIERRVYDISRWAQRHPGGSRLIGHHGAEDATDAFRAFH 78

|: ..: Mil. .|||:::| • Mil- I ll hlM ^ 
P 1 HNDGEETWLWNGQVYOITKFLEEHPGGPDVltCAAGTOATEEFEAlH 48 

t************* *«*«•******« ***«*«*****••••************* 

Cytochrome b5 family, heme-binding domain signature
pir:s68358 hypothetical protein - common sunflower
Length = 458 

Score = 169 (79.4 bits), Expect = 2.8e-42, Sum P(4) = 2.8e-42
Identities = 31/85 (36%), Positives = 49/85 (57%)

His box 3

Query: 348 IGHEKHRDWVSSQLAATCNVEPSLFTNWFSGHLNFQIEHHLFPRMPRHNYSRVAPLVKSL 407

4G K 4*1 Q T++ S + -HNF G L FQ+EHttl^PR+PR + ++P+ + L
Sbjct: 348 VGPPKGDNVf EKQTRGTIDlACSSWMDIIIFFGGLQFbLEH^ 407

Query: 408 CAKHGLSYEVKPFLTALVDIVRSLK 432 

C K+ L Y F A V +++L+ 
Sbjct: 408 CKKYNLPYVSLSFYDANVTTLKTLR 432

Score = 133 (62.5 bits), Expect = 2.8e-42, Sum P(4) = 2.8e-42

Identities = 21/53 (39%), Positives = 35/53 (66%)

HPGG motif 

Query: 26 EQIRAHDQPGDKWLVIERRVYDISRWAQRHPGGSRLIGHHGAEDATDAFRAFH 78

++++H+PD#fI +VY4+h WA+ HPGGI ++ -H) TDAF AFH
Sbjct: 22 KELKKHNI^NDLWISILGKWNVTEWAKEkfCgDAPL INLAGQDVTDAF I AFH 74

Score = 118 (55.5 bits), Expect = 2.8e-42, Sum P(4) = 2.8e-42
Identities = 25/76 (32%), Positives = 34/76 (44%)

His box 1 His box 2



Query: 165 LAAFILAISQAQSWCLQHDLGHASIFKKSWWNHVAQKFVMGQLKGFSAHWWNFRHFQHHA 224

L+ IL ++ Q L}iD GH + WN A F+ + G S VyW H
Sbjct: 152 LSGAILGLAVM)lAYLGyDAGfifQ^•llATRGW^

Query: 225 KPNIFHKDPDVTVAPV 240

N DPD+ P+
Sbjct: 212 ACNSLDYDPDLQHLPM 227



Score = 34 (16.0 bits), Expect = 2.8e-42, Sum P(4) = 2.8e-42
Identities = 7/14 (50%), Positives = 9/14



FIG.7A 
gp:bou79010_1 PID:g2062403 Borago officinalis delta 6 desaturase mRNA,
complete cds. (gb:U79010) (NID:2062402)
Length = 448 

Score = 179 (84.1 bits), Expect = 2.3e-42, Sum P(3) = 2.3e-42

Identities = 34/87 (39%), Positives = 48/87 (55%)

His box 3

Query: 348 IGHEKHRDWVSSQLAATCNVEPSLFTNWFSGHLNFQIEHHLFPRMPRHNYSRVAPLVKSL 407

4G K +W Q T -H- + +V«F G L FQlEHhlFP+MPR N +HP V L
Sbjct: 338 VGKPKGNNVIFEKQTlX;TLDISCPPVM)WFHGGLQFlQlEHaLFPK^^^ 397

Query: 408 CAKHGLSYEVKPFLTALVDIVRSLKKS 434

C KH L Y FA +R+L+ +
Sbjct: 398 CKKHNLPYNYASFSKANEMTLRTLRNT 424



Score = 144 (67.7 bits), Expect = 2.3e-42, Sum P(3) = 2.3e-42

Identities = 23/53 (43%), Positives = 36/53 (67%)

HPGG MOTIF 



Query: 26 EQIRAHDQPGDKWLVIERRVYDISRWAQRHPGGSRLIGHHGAEDATDAFRAFH 78

++++ HD+PGD W+ 1+ + YD+S W + HPGGS + ++ TDAF AFH
Sbjct: 12 DELKNHDKPGDLWISIQGKAYDVSDWVKliFGgSFPLKSLAGQEVTDAFVAF 64



Score = 105 (49.3 bits), Expect = 2.3e-42, Sum P(3) = 2.3e-42
Identities = 22/68 (32%), Positives = 28/68 (41%)

His box 1 His box 2

Query: 176 QSWCLQHDLGHASIFKKSWWNHVAQKFVMGQLKGFSAHWWNFRHFQHHAKPNIFHKDPDV 235

QS + HD gH + S N F L G S WW + Hh N DPD+
Sbjct: 153 QSGWIcimiifMWSOSRLNKFM^ 212

Query: 236 TVAPVFLL 243

p ++
Sbjct: 213 QVIPFLW 220


FIG.7B 
pir:s35157 Delta(6)-desaturase - Synechocystis sp.
Length = 359 

Score = 126 (59.2 bits), Expect = 9.0e-09, Sum P(2) = 9.0e-09
Identities = 21/54 (38%), Positives = 33/54 (51%)

His box 3 

Query: 372 FTNWFSGHLNFQIEHHLFPRMPRHNYSRVAPLVKSLCAKHGLSYEVKPFLTALV 425

F NMF G LN Q+ \k.fP + +Y ++ -H-K +C + G+ Y+V P A +
Sbjct: 292 FWNWFCGGLNH^^^^LFPNICH1HYPQLENIIKDVCQEFGVEYKVYPTFKAAI 345



Score = 36 (16.9 bits), Expect = 9.0e-09, Sum P(2) = 9.0e-09
Identities = 6/15 (40%), Positives = 8/15 (53%)

His box 2
Query: 209 GFSAHWWNFRHFQHH 223

G S+ W +RH H
Sbjct: 113 GLSSFLWRYltiNYLH] 127



FIG.8 
1 2 3 4 5 6 7 8 9




1. Heart 6. Skeletal Muscle 

2. Brain 7. Kidney 

3. Placenta 8. Pancreas 

4. Lung 9. Retina 

5. Liver



FIG.9A 
1 2 3 4 5 6 7 8 9 LpcR Marker



1. Heart 

2. Brain 

3. Placenta 

4. Lung 

5. Liver



6. Skeletal Muscle 

7. Kidney 

8. Pancreas 

9. Retina 